Abstract. This is a graduate-level introduction to graph theory,
corresponding to a quarter-long course. It covers simple graphs,
multigraphs as well as their directed analogues, and more restrictive
classes such as tournaments, trees and arborescences. Among the
features discussed are Eulerian circuits, Hamiltonian cycles, spanning
trees, the matrix-tree and BEST theorems, proper colorings,
Turán’s theorem, bipartite matching and the Menger and Gallai-Milgram
theorems. The basics of network flows are introduced in
order to prove Hall’s marriage theorem.

Around a hundred exercises are included (without solutions). 
1. Preface


1.1. What is this? 


This is a course on graphs — a rather elementary concept (actually a cluster of 
closely related concepts) that can be seen all over mathematics. We will discuss 
several kinds of graphs (simple graphs, multigraphs, directed graphs, etc.) and 
study their features and properties. In particular, we will encounter walks on 
graphs, matchings of graphs, flows on networks (networks are graphs with 
extra data), and take a closer look at certain types of graphs such as trees and 
tournaments. 

The theory of graphs goes back at least to Leonhard Euler, who in a 1736 
paper [Euler36] (see [Euler53] for an English translation) solved a puzzle about 
an optimal tour of the town of Königsberg. It saw some more developments in 
the 19th century and straight-up exploded in the 20th; now it is one of the most 
active fields of mathematics. There are now dozens (if not hundreds) of textbooks
available on the subject, such as 


• the comprehensive works [BonMur08], [Berge91], [Ore74], [Bollob98],
[Dieste17], [ChLeZh16], [Jungni13]


• or the more introductory [Ore96], [BenWil06, Chapters 5-6], [Bollob71],
[Griffi21], [Guicha16], [Harary69], [HaHiMo08], [Ruohon13], [KelTro17],
[LoPeVe03], [West01], [Verstr21], [HarRin03].


These texts are written at different levels of sophistication, rigor and detail, are 
tailored to different audiences, and (beyond the absolute basics) often cover 
different ground (for instance, one of them distinguishes itself by treating infinite
and random graphs, whereas another is strong on applications).

The present notes are self-contained and do not follow any existing book. 
Nevertheless, I recommend skimming the texts cited above to gain a wider 
perspective on graph theory (far beyond what we can cover in an introductory 
course), and perhaps marking one book or another for later reading. Our
focus in these notes is on the more discrete and algebraic sides of graph theory 
(finite graphs of various kinds, existential results, counting formulas), and they 
are limited both by the time constraints (being written for a quarter-long course) 
and the limits of my own knowledge. 


1.1.1. Remarks 


Prerequisites. These notes target a graduate-level (or advanced undergraduate) 
reader. A certain mathematical sophistication and willingness to think along 
(as well as invent one’s own examples) is expected. Beyond that, the main 
prerequisites are the basic properties of determinants, polynomials and finite 
sums. Rings and fields are occasionally mentioned, but the reader can make do 
with just the most basic examples thereof (Q, R, polynomial rings and matrix 
rings; also the finite field F_2 in a few places). No analysis (or even calculus) is
required anywhere in this text. 


Course websites. These notes were written for my Math 530 course at Drexel 
University in Spring 2022. The website of this course can be found at 


https://www.cip.ifi.lmu.de/~grinberg/t/22s . 


An older, but similarly structured course is my Spring 2017 course at the Uni- 
versity of Minnesota. Its website is available at 


https://www.cip.ifi.lmu.de/~grinberg/t/17s , 


and contains some additional materials (such as solutions to some selected 
exercises, a few more detailed topics, and a stub of a text that covers parts
of our Chapter 2 in more depth). If you are reading the present notes on the
arXiv, then said additional materials can also be found as ancillary files to this 
arXiv submission. 


Exercises. These notes include exercises of varying difficulty and signifi- 
cance. Almost all of the exercises are optional (i.e., they are not used anywhere 
in the text, except perhaps in other exercises), but they often provide practice, 
context and additional inspiration. Naturally, one person’s inspiration is an- 
other’s distraction, so I do not recommend assigning too much importance to 
any specific exercise; it is usually better to read on than to dwell for hours. 
However, a dozen minutes of thought per exercise will likely not be a waste of 
time. 
Acknowledgments. I have learned a lot from conversations with Joel Brewster
Lewis, Lukas Katthän and Victor Reiner. Chiara Libera Carnevale and
Amanda Johnson corrected errors in previous versions of these notes. I am 
indebted to all of the above, and would appreciate any further input — please 
contact darijgrinberg@gmail.com about any corrections (however small) and 
suggestions. 


1.2. Notations 


The following notations will be used throughout these notes: 
• We let N = {0,1,2,...}. Thus, 0 ∈ N.

• The size (i.e., cardinality) of a finite set S is denoted by |S|.

• If S is a set, then the powerset of S means the set of all subsets of S. This
powerset will be denoted by P (S).

Moreover, if S is a set, and k is an integer, then P_k (S) will mean the set of
all k-element subsets of S. For instance,

P_2 ({1,2,3}) = {{1,2}, {1,3}, {2,3}}.


• For any number n and any k ∈ N, we define the binomial coefficient
$\binom{n}{k}$ to be the number

$$\binom{n}{k} = \frac{n (n-1) (n-2) \cdots (n-k+1)}{k!} = \frac{\prod_{i=0}^{k-1} (n-i)}{k!}.$$

These binomial coefficients have many interesting properties, which can
often be found in textbooks on enumerative combinatorics (e.g.,
[..., Chapter 2]). Some of the most important ones are the following:

- The factorial formula: If n,k ∈ N and n ≥ k, then
$\binom{n}{k} = \frac{n!}{k! \cdot (n-k)!}$.

- The combinatorial interpretation: If n,k ∈ N, and if S is an n-element
set, then $\binom{n}{k}$ is the number of all k-element subsets of S (in other
words, |P_k (S)| = $\binom{n}{k}$).

- Pascal’s recursion: For any number n and any positive integer k, we
have $\binom{n}{k} = \binom{n-1}{k-1} + \binom{n-1}{k}$.
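The three properties above are easy to spot-check by machine. Here is a minimal Python sketch; the helper names `binom` and `k_subsets` are made up for this illustration, and `binom` assumes that n is an integer (the definition above allows any number n).

```python
from itertools import combinations
from math import factorial


def binom(n, k):
    """Binomial coefficient n(n-1)...(n-k+1) / k! for an integer n and k in N."""
    numerator = 1
    for i in range(k):
        numerator *= n - i
    # A product of k consecutive integers is divisible by k!, so this is exact.
    return numerator // factorial(k)


def k_subsets(S, k):
    """P_k(S): the set of all k-element subsets of S."""
    return {frozenset(c) for c in combinations(S, k)}


# Combinatorial interpretation and factorial formula for n = 5, k = 2.
assert len(k_subsets({1, 2, 3, 4, 5}, 2)) == binom(5, 2) == 10
assert binom(5, 2) == factorial(5) // (factorial(2) * factorial(3))
# Pascal's recursion also holds for negative n.
assert binom(-3, 2) == binom(-4, 1) + binom(-4, 2)
```

Subsets are stored as `frozenset`s so that they can themselves be elements of a Python set; this mirrors the fact that P_k (S) is a set of sets.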
2. Simple graphs


2.1. Definitions 


The first type of graphs that we will consider are the “simple graphs”, named 
so because of their very simple definition: 


Definition 2.1.1. A simple graph is a pair (V, E), where V is a finite set, and
where E is a subset of P_2 (V).


Recall that P_2 (V) is the set of all 2-element subsets of V. Thus, a simple
graph is a pair (V,E), where V is a finite set, and E is a set consisting of
2-element subsets of V. We will abbreviate the word “simple graph” as “graph”
in this chapter, but later (in Chapter 3) we will learn some more advanced and
general notions of “graphs”.


Example 2.1.2. Here is a simple graph: 


({1,2,3,4}, {{1,3}, {1,4}, {3,4}}).


Example 2.1.3. For any n ∈ N, we can define a simple graph Cop_n to be the
pair (V,E), where V = {1,2,...,n} and


E = {{u,v} ∈ P_2 (V) | gcd (u,v) = 1}.
We call this the n-th coprimality graph. 
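To make the definition concrete, here is a small Python sketch that builds Cop_n literally as a pair (V, E). The function name `coprimality_graph` and the choice of `frozenset` for 2-element subsets are assumptions of this sketch, not notation from the notes.

```python
from itertools import combinations
from math import gcd


def coprimality_graph(n):
    """Return Cop_n as a pair (V, E), where E is a set of 2-element frozensets."""
    V = set(range(1, n + 1))
    E = {frozenset({u, v}) for u, v in combinations(V, 2) if gcd(u, v) == 1}
    return V, E


V, E = coprimality_graph(5)
# Among the 10 two-element subsets of {1,...,5}, only {2,4} is not coprime.
assert len(E) == 9
assert frozenset({2, 4}) not in E
```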


(Some authors do not require V to be finite in Definition 2.1.1; this leads to
infinite graphs. But I shall leave this can of worms closed for this quarter.) 


The purpose of simple graphs is to encode relations on a finite set — specif- 
ically the kind of relations that are binary (i.e., relate pairs of elements), sym- 
metric (i.e., mutual) and irreflexive (i.e., an element cannot be related to itself). 
For example, the graph Cop_n in Example 2.1.3 encodes the coprimality (aka
coprimeness) relation on the set {1,2,...,n}, except that the latter relation is
not irreflexive (1 is coprime to 1, but {1,1} is not in E; thus, the graph Cop_n
“forgets” that 1 is coprime to 1). For another example, if V is a set of people,
and E is the set of {u,v} ∈ P_2 (V) such that u has been married to v at some
point, then (V, E) is a simple graph. Even in 2022, marriage to oneself is not a
thing, so all marriages can be encoded as 2-element subsets.¹


The following notations provide a quick way to reference the elements of V 
and E when given a graph (V, E): 


¹The more standard example for a social graph would be a “friendship graph”; here, V is
again a set of people, but E is now the set of {u,v} ∈ P_2 (V) such that u and v are friends.
Of course, this only works if you think of friendship as being automatically mutual (true
for Facebook friendship, questionable for the actual thing).
Definition 2.1.4. Let G = (V,E) be a simple graph.


(a) The set V is called the vertex set of G; it is denoted by V(G). (Notice
that the letter “V” in “V (G)” is upright, as opposed to the letter “V”
in “(V,E)”, which is italic. These are two different symbols, and have
different meanings: The italic letter V stands for the specific set V which is
the first component of the pair G, whereas the upright letter V is part of the
notation V (G) for the vertex set of any graph. Thus, if H = (W,F) is
another graph, then V (H) is W, not V.)


The elements of V are called the vertices (or the nodes) of G. 


(b) The set E is called the edge set of G; it is denoted by E(G). (Again, the
letter “E” in “E(G)” is upright, and stands for a different thing than
the italic “E”.)

The elements of E are called the edges of G. When u and v are two
elements of V, we shall often use the notation uv for {u,v}; thus, each
edge of G has the form uv for two distinct elements u and v of V. Of
course, we always have uv = vu.

Notice that each simple graph G satisfies G = (V (G),E(G)).


(c) Two vertices u and v of G are said to be adjacent (to each other) if
uv ∈ E (that is, if uv is an edge of G). In this case, the edge uv is said
to join u with v (or connect u and v); the vertices u and v are called
the endpoints of this edge. When the graph G is not obvious from the
context, we shall often say “adjacent in G” instead of just “adjacent”.

Two vertices u and v of G are said to be non-adjacent (to each other) if
they are not adjacent (i.e., if uv ∉ E).


(d) Let v be a vertex of G (that is, v ∈ V). Then, the neighbors of v (in
G) are the vertices u of G that satisfy vu ∈ E. In other words, the
neighbors of v are the vertices of G that are adjacent to v.


Example 2.1.5. Let G be the simple graph 
({1,2,3,4}, {{1,3}, {1,4}, {3,4}})
from Example 2.1.2. Then, its vertex set and its edge set are
V (G) = {1,2,3,4} and E(G) = {{1,3}, {1,4}, {3,4}} = {13, 14, 34}


(using our notation uv for {u,v}). The vertices 1 and 3 are adjacent (since
13 ∈ E(G)), but the vertices 1 and 2 are not (since 12 ∉ E(G)). The neighbors
of 1 are 3 and 4. The endpoints of the edge 34 are 3 and 4. 
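The claims of this example can be replayed in a few lines of Python. This is only a sketch: the helpers `edge`, `adjacent` and `neighbors` are hypothetical names chosen here, and a graph is modelled as a pair of Python sets.

```python
def edge(u, v):
    """The edge uv, i.e. the 2-element set {u, v}; note edge(u, v) == edge(v, u)."""
    return frozenset({u, v})


def adjacent(G, u, v):
    """True if u and v are adjacent in G = (V, E)."""
    V, E = G
    return edge(u, v) in E


def neighbors(G, v):
    """The set of all vertices of G = (V, E) that are adjacent to v."""
    V, E = G
    return {u for u in V if edge(v, u) in E}


G = ({1, 2, 3, 4}, {edge(1, 3), edge(1, 4), edge(3, 4)})
assert adjacent(G, 1, 3) and not adjacent(G, 1, 2)
assert neighbors(G, 1) == {3, 4}
assert neighbors(G, 2) == set()
```

Since `edge(v, v)` is a 1-element set, it can never lie in E, so a vertex is never its own neighbor; this matches the irreflexivity built into Definition 2.1.1.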
2.2. Drawing graphs


There is a common method to represent graphs visually: Namely, a graph can 
be drawn as a set of points in the plane and a set of curves connecting some of 
these points with each other. 

More precisely: 


Definition 2.2.1. A simple graph G can be visually represented by drawing 
it on the plane. To do so, we represent each vertex of G by a point (at which 
we put the name of the vertex), and then, for each edge uv of G, we draw a 
curve that connects the point representing u with the point representing v. 
The positions of the points and the shapes of the curves can be chosen freely, 
as long as they allow the reader to unambiguously reconstruct the graph G 
from the picture. (Thus, for example, the curves should not pass through 
any points other than the ones they mean to connect.) 


Example 2.2.2. Let us draw some simple graphs. 


(a) The simple graph ({1,2,3}, {12,23}) (where we are again using the 
shorthand notation uv for {u,v}) can be drawn as follows: 




This is (in a sense) the simplest way to draw this graph: The edges are 
represented by straight lines. But we can draw it in several other ways as
well, e.g., as follows:


Here, we have placed the points representing the vertices 1,2,3 differently. 
As a consequence, we were not able to draw the edge 12 as a straight line, 
because it would then have overlapped with the vertex 3, which would make 
the graph ambiguous (the edge 12 could be mistaken for two edges 13 and 
32). 

Here are three further drawings of the same graph ({1,2,3}, {12,23}): 




(b) Consider the 5-th coprimality graph Cop_5 defined in Example 2.1.3.


An introduction to graph theory, version August 2, 2023 page 12 


Here is one way to draw it: 


Here is another way to draw the same graph Cop_5, with fewer intersections
between edges:


By appropriately repositioning the points corresponding to the five vertices
of Cop_5, we can actually get rid of all intersections and make all the edges
straight (as opposed to curved). Can you find out how? 


(c) Let us draw one further graph: the simple graph
({1,2,3,4,5}, P_2 ({1,2,3,4,5})). This is the simple graph whose vertices
are 1,2,3,4,5, and whose edges are all possible two-element sets
consisting of its vertices (i.e., each pair of two distinct vertices is adjacent).
We shall later call this graph the “complete graph K_5”. Here is a simple way
to draw this graph:


This drawing is useful for many purposes; for example, it makes the abstract
symmetry of this graph (i.e., the fact that, roughly speaking, its vertices
1,2,3,4,5 are “equal in rights”) obvious. But sometimes, you might want to
draw it differently, to minimize the number of intersecting curves. Here is a 
drawing with fewer intersections: 


In this drawing, we have only one intersection between two curves left. Can 
we get rid of all intersections? 

This is a question of topology, not of combinatorics, since it really is about 
curves in the plane rather than about finite sets and graphs. The answer is 
“no”. (That is, no matter how you draw this graph in the plane, you will 
always have at least one pair of curves intersect.) This is a classical result 
(one of the first theorems in the theory of planar graphs), and proofs of it
can be found in various textbooks (e.g., [..., Theorem 4.1.2], which is
generally a good introduction to planar graph theory even if it uses terminology
somewhat different from ours). Note that any proof must use some
analysis or topology, since the result relies on the notion of a (continuous) 
curve in the plane (if curves were allowed to be non-continuous, then they 
could “jump over” one another, so they could easily avoid intersecting!). 


2.3. A first fact: The Ramsey number R (3,3) = 6 


Enough definitions; let’s state a first result: 


Proposition 2.3.1. Let G be a simple graph with |V (G)| ≥ 6 (that is, G has at
least 6 vertices). Then, at least one of the following two statements holds:


• Statement 1: There exist three distinct vertices a, b and c of G such that
ab, bc and ca are edges of G.


• Statement 2: There exist three distinct vertices a, b and c of G such that
none of ab, bc and ca is an edge of G.


In other words, Proposition 2.3.1 says that if a graph G has at least 6 vertices,
then we can either find three distinct vertices that are mutually adjacent² or find
three distinct vertices that are mutually non-adjacent (i.e., no two of them are
adjacent), or both. Often, this is restated as follows: “In any group of at least
six people, you can always find three that are (pairwise) friends to each other, 
or three no two of whom are friends” (provided that friendship is a symmetric 
relation). 

We will give some examples in a moment, but first let us introduce some 
convenient terminology: 


Definition 2.3.2. Let G be a simple graph. 


(a) A set {a,b,c} of three distinct vertices of G is said to be a triangle (of 
G) if every two distinct vertices in this set are adjacent (i.e., if ab, bc and 
ca are edges of G). 


(b) A set {a,b,c} of three distinct vertices of G is said to be an anti-triangle 
(of G) if no two distinct vertices in this set are adjacent (i.e., if none of 
ab, bc and ca is an edge of G). 


Thus, Proposition 2.3.1 says that every simple graph with at least 6 vertices
contains a triangle or an anti-triangle (or both). 


Example 2.3.3. Let us show two examples of graphs G to which Proposition 2.3.1
applies, as well as an example to which it does not:


(a) Let G be the graph (V,E), where


V = {1,2,3,4,5,6} and
E = {{1,2}, {2,3}, {3,4}, {4,5}, {5,6}, {6,1}}.


(This graph can be drawn in such a way as to look like a hexagon.)
This graph satisfies Proposition 2.3.1, since {1,3,5} is an anti-triangle
(or since {2,4,6} is an anti-triangle).


²By which we mean (of course) that any two distinct ones among these three vertices are
adjacent.
(b) Let G be the graph (V,E), where


V = {1,2,3,4,5,6} and
E = {{1,2}, {2,3}, {3,4}, {4,5}, {5,6}, {6,1}, {1,3}, {4,6}}.


(This graph can be drawn in such a way as to look like a hexagon with
two extra diagonals.) This graph satisfies Proposition 2.3.1, since {1,2,3} is a triangle.

(c) Let G be the graph (V, E), where


V = {1,2,3,4,5} and
E = {{1,2}, {2,3}, {3,4}, {4,5}, {5,1}}.


(This graph can be drawn to look like a pentagon.)
Proposition 2.3.1 says nothing about this graph, since this graph does
not satisfy the assumption of Proposition 2.3.1 (in fact, its number of
vertices |V (G)| fails to be ≥ 6). By itself, this does not yield that the
claim of Proposition 2.3.1 is false for this graph. However, it is easy
to check that the claim actually is false for this graph: It has neither a
triangle nor an anti-triangle.


Proof of Proposition 2.3.1. We need to prove that G has a triangle or an
anti-triangle (or both).

Choose any vertex u ∈ V (G). (This is clearly possible, since |V (G)| ≥ 6 ≥ 1.)
Then, there are at least 5 vertices distinct from u (since G has at least 6 vertices).
We are in one of the following two cases:

Case 1: The vertex u has at least 3 neighbors.

Case 2: The vertex u has at most 2 neighbors.

Let us consider Case 1 first. In this case, the vertex u has at least 3 neighbors.
Hence, we can find three distinct neighbors p, q and r of u. Consider these p, q
and r. If one (or more) of pq, qr and rp is an edge of G, then G has a triangle
(for example, if pq is an edge of G, then {u, p,q} is a triangle). If not, then G has
an anti-triangle (namely, {p,q,r}). Thus, in either case, our proof is complete
in Case 1.

Let us now consider Case 2. In this case, the vertex u has at most 2 neighbors.
Hence, the vertex u has at least 3 non-neighbors³ (since there are at least 5
vertices distinct from u in total). Thus, we can find three distinct non-neighbors
p, q and r of u. Consider these p, q and r. If all of pq, qr and rp are edges of G,
then G has a triangle (namely, {p,q,r}). If not, then G has an anti-triangle (for
example, if pq is not an edge of G, then {u, p,q} is an anti-triangle). In either
case, we are thus done with the proof in Case 2. Thus, both cases are resolved,
and the proof is complete. □


Notice the symmetry between Case 1 and Case 2 in our above proof: the ar- 
guments used were almost the same, except that neighbors and non-neighbors 
swapped roles. 
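The proof above is constructive, so it can be turned into a procedure that actually finds a triangle or an anti-triangle. The following Python sketch follows the two cases literally; the function name and the choice u = min(V) are assumptions of this sketch (any vertex u would do, and the vertices are assumed to be comparable so that `min` and `sorted` work).

```python
from itertools import combinations


def find_triangle_or_anti_triangle(V, E):
    """Follow the proof of Proposition 2.3.1 on a graph (V, E) with |V| >= 6."""
    assert len(V) >= 6

    def adj(x, y):
        return frozenset({x, y}) in E

    u = min(V)  # "choose any vertex u"
    nbrs = sorted(x for x in V if x != u and adj(u, x))
    if len(nbrs) >= 3:
        # Case 1: three neighbors p, q, r of u.
        p, q, r = nbrs[:3]
        for x, y in combinations((p, q, r), 2):
            if adj(x, y):
                return "triangle", {u, x, y}
        return "anti-triangle", {p, q, r}
    # Case 2: u has at most 2 neighbors, hence at least 3 non-neighbors.
    non_nbrs = sorted(x for x in V if x != u and not adj(u, x))
    p, q, r = non_nbrs[:3]
    for x, y in combinations((p, q, r), 2):
        if not adj(x, y):
            return "anti-triangle", {u, x, y}
    return "triangle", {p, q, r}


# The hexagon from Example 2.3.3 (a) and the graph from Example 2.3.3 (b).
V = {1, 2, 3, 4, 5, 6}
hexagon = {frozenset(p) for p in [(1, 2), (2, 3), (3, 4), (4, 5), (5, 6), (6, 1)]}
with_diagonals = hexagon | {frozenset({1, 3}), frozenset({4, 6})}
assert find_triangle_or_anti_triangle(V, hexagon) == ("anti-triangle", {1, 3, 5})
assert find_triangle_or_anti_triangle(V, with_diagonals) == ("triangle", {1, 2, 3})
```

Note how the two branches differ only by swapping `adj` with `not adj`, which is exactly the symmetry between Case 1 and Case 2.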


Remark 2.3.4. Proposition 2.3.1 could also be proved by brute force
(using a computer). Indeed, it clearly suffices to prove it for all simple graphs
with 6 vertices (as opposed to ≥ 6 vertices), because if a graph has more than
6 vertices, then we can just throw away some of them until we have only 6
left. However, there are only finitely many simple graphs with 6 vertices (up
to relabeling of their vertices), and the validity of Proposition 2.3.1 can be
checked for each of them. This is, of course, cumbersome (even a computer
would take a moment checking all the 2^15 possible graphs for triangles and
anti-triangles) and unenlightening. 
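For the curious, the brute-force check described in this remark might look as follows in Python. This is a sketch, not an optimized search: it assumes the vertices are 0, 1, ..., n−1, runs over all labeled graphs (so it does not exploit relabeling), and encodes each graph as a bitmask over the possible edges. The function names are invented for this illustration.

```python
from itertools import combinations


def has_triangle_or_anti_triangle(n, edge_set):
    """Check a graph on vertices 0..n-1 whose edges are pairs (a, b) with a < b."""
    for a, b, c in combinations(range(n), 3):
        present = ((a, b) in edge_set) + ((b, c) in edge_set) + ((a, c) in edge_set)
        if present == 3 or present == 0:
            return True
    return False


def every_graph_has_one(n):
    """True if each simple graph on n labeled vertices has a triangle or anti-triangle."""
    pairs = list(combinations(range(n), 2))
    # Each bitmask selects one subset of the possible edges.
    for mask in range(2 ** len(pairs)):
        edge_set = {p for i, p in enumerate(pairs) if (mask >> i) & 1}
        if not has_triangle_or_anti_triangle(n, edge_set):
            return False
    return True


assert every_graph_has_one(6)       # all 2^15 graphs on 6 vertices pass
assert not every_graph_has_one(5)   # the pentagon is a counterexample
```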


Proposition 2.3.1 is the first result in a field of graph theory known as Ramsey
theory. I shall not dwell on this field in this course, but let me make a few more
remarks. The first step beyond Proposition 2.3.1 is the following generalization:

Proposition 2.3.5. Let r and s be two positive integers. Let G be a simple
graph with |V (G)| ≥ $\binom{r+s-2}{r-1}$. Then, at least one of the following two
statements holds:


• Statement 1: There exist r distinct vertices of G that are mutually adjacent
(i.e., each two distinct ones among these r vertices are adjacent).


• Statement 2: There exist s distinct vertices of G that are mutually
non-adjacent (i.e., no two distinct ones among these s vertices are adjacent).


³The word “non-neighbor” shall here mean a vertex that is not adjacent to u and distinct from
u. Thus, u does not count as a non-neighbor of u.


Applying Proposition 2.3.5 to r = 3 and s = 3, we can recover Proposition 2.3.1.


One might wonder whether the number $\binom{r+s-2}{r-1}$ in Proposition 2.3.5 can
be improved, i.e., whether we can replace it by a smaller number without
making Proposition 2.3.5 false. In the case of r = 3 and s = 3, this is impossible,
because the number 6 in Proposition 2.3.1 cannot be made smaller.⁴
However, for some other values of r and s, the value $\binom{r+s-2}{r-1}$ can be
improved. (For example, for r = 4 and s = 4, the best possible value is 18 rather
than $\binom{4+4-2}{4-1} = 20$.) The smallest possible value that could stand in place
of $\binom{r+s-2}{r-1}$ in Proposition 2.3.5 is called the Ramsey number R (r,s); thus,
we have just shown that R (3,3) = 6. Finding R (r,s) for higher values of r
and s is a hard computational challenge; here are some values that have been
found with the help of computers:


R (3,4) = 9;   R (3,5) = 14;  R (3,6) = 18;  R (3,7) = 23;
R (3,8) = 28;  R (3,9) = 36;  R (4,4) = 18;  R (4,5) = 25.


(We are only considering the cases r ≤ s, since it is easy to see that R (r,s) =
R (s,r) for all r and s. Also, the trivial values R (1,s) = 1 and R (2,s) = s
for s ≥ 2 are omitted.) The Ramsey number R (5,5) is still unknown (although
it is known that 43 ≤ R (5,5) ≤ 48).
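To see how far the binomial bound of Proposition 2.3.5 is from the truth, one can tabulate it next to the Ramsey numbers quoted above. The Python sketch below does only this comparison; the dictionary `known` just copies the values from the text (plus R (3,3) = 6), and `binomial_bound` is a name made up here.

```python
from math import comb

# Ramsey numbers R(r, s) as quoted in the text above.
known = {
    (3, 3): 6, (3, 4): 9, (3, 5): 14, (3, 6): 18, (3, 7): 23,
    (3, 8): 28, (3, 9): 36, (4, 4): 18, (4, 5): 25,
}


def binomial_bound(r, s):
    """The number binom(r+s-2, r-1) from Proposition 2.3.5."""
    return comb(r + s - 2, r - 1)


for (r, s), value in sorted(known.items()):
    # The bound is never smaller than the Ramsey number, and is sharp for (3, 3).
    assert value <= binomial_bound(r, s)
    print(f"R({r},{s}) = {value}, binomial bound = {binomial_bound(r, s)}")
```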

Proposition 2.3.5 can be further generalized to a result called Ramsey’s theorem.
The idea behind the generalization is to slightly change the point of view,
and replace the simple graph G by a complete graph (i.e., a simple graph in
which every two distinct vertices are adjacent) whose edges are colored in two
colors (say, blue and red). This is a completely equivalent concept, because
the concepts of “adjacent” and “non-adjacent” in G can be identified with the
concepts of “adjacent through a blue edge” (i.e., the edge connecting them is
colored blue) and “adjacent through a red edge”, respectively. Statements 1
and 2 then turn into “there exist r distinct vertices that are mutually adjacent
through blue edges” and “there exist s distinct vertices that are mutually adjacent
through red edges”, respectively. From this point of view, it is only logical
to generalize Proposition 2.3.5 further to the case when the edges of a complete
graph are colored in k (rather than two) colors. The corresponding generalization
is known as Ramsey’s theorem. We refer to the well-written Wikipedia
page for a treatment of
this generalization with proof, as well as a table of known Ramsey numbers
R (r,s) and a self-contained (if somewhat terse) proof of Proposition 2.3.5. Ramsey’s
theorem can be generalized and varied further; this usually goes under
the name “Ramsey theory”. For elementary introductions, see the Cut-the-knot
page (http://www.cut-the-knot.org/Curriculum/Combinatorics/ThreeOrThree.shtml),
the above-mentioned Wikipedia article, as well as the texts by Harju [Harju14],
Bollobás and West [West01].


⁴Indeed, we saw in Example 2.3.3 (c) that 5 vertices would not suffice.


There is one more direction in which Proposition 2.3.1 can be improved a bit:
A graph G with at least 6 vertices has not only one triangle or anti-triangle, but 
at least two of them (this can include having one triangle and one anti-triangle). 
Proving this makes for a nice exercise: 


Exercise 2.1. Let G be a simple graph. A triangle-or-anti-triangle in G means 
a set that is either a triangle or an anti-triangle. 


(a) Assume that |V (G)| ≥ 6. Prove that G has at least two
triangle-or-anti-triangles. (For comparison: Proposition 2.3.1 shows that G has
at least one triangle-or-anti-triangle.)


(b) Assume that |V (G)| = m + 6 for some m ∈ N. Prove that G has at least
m + 1 triangle-or-anti-triangles.


[Solution: This is Exercise 1 on homework set #1 from my Spring 2017
course; see the course page for solutions.]


2.4. Degrees 


The degree of a vertex in a simple graph just counts how many edges contain 
this vertex: 


Definition 2.4.1. Let G = (V,E) be a simple graph. Let v ∈ V be a vertex.
Then, the degree of v (with respect to G) is defined to be

$$\begin{aligned}
\deg v &:= (\text{the number of edges } e \in E \text{ that contain } v) \\
&= (\text{the number of neighbors of } v) \\
&= |\{u \in V \mid uv \in E\}| \\
&= |\{e \in E \mid v \in e\}|.
\end{aligned}$$

(These equalities are pretty easy to check: Each edge e ∈ E that contains v
contains exactly one neighbor of v, and conversely, each neighbor of v belongs
to exactly one edge that contains v. However, these equalities are specific to
simple graphs, and won’t hold any more once we move on to multigraphs.)

For example, in the graph 


the vertices have degrees 
deg1 = 3, deg2 = 2, deg3 = 3, deg4 = 2, deg5 = 0. 


Here are some basic properties of degrees in simple graphs: 


Proposition 2.4.2. Let G be a simple graph with n vertices. Let v be a vertex
of G. Then,
deg v ∈ {0,1,...,n − 1}.


Proof. All neighbors of v belong to the (n − 1)-element set V (G) \ {v}. Thus,
their number is ≤ n − 1. □


Proposition 2.4.3 (Euler 1736). Let G be a simple graph. Then, the sum of
the degrees of all vertices of G equals twice the number of edges of G. In
other words,

$$\sum_{v \in V(G)} \deg v = 2 \cdot |E(G)|.$$


Proof. Write the simple graph G as G = (V, E); thus, V (G) = V and E (G) = E.
Now, let N be the number of all pairs (v,e) ∈ V × E such that v ∈ e. We
compute N in two different ways (this is called “double-counting”):


1. We can obtain N by computing, for each v ∈ V, the number of all e ∈ E
that satisfy v ∈ e, and then summing these numbers over all v. Since these
numbers are just the degrees deg v, the result will be $\sum_{v \in V} \deg v$.


2. On the other hand, we can obtain N by computing, for each e ∈ E, the
number of all v ∈ V that satisfy v ∈ e, and summing these numbers over
all e. Since each e ∈ E contains exactly 2 vertices v ∈ V, this result will be
$\sum_{e \in E} 2 = |E| \cdot 2 = 2 \cdot |E|$.

Since these two results must be equal (because they both equal N), we thus
see that $\sum_{v \in V} \deg v = 2 \cdot |E|$. But this is the claim of Proposition 2.4.3. □


Corollary 2.4.4 (handshake lemma). Let G be a simple graph. Then, the
number of vertices v of G whose degree deg v is odd is even.


Proof. Proposition 2.4.3 yields that $\sum_{v \in V(G)} \deg v = 2 \cdot |E(G)|$.
Hence, $\sum_{v \in V(G)} \deg v$ is even. However, if a sum of integers is even,
then it must have an even number of odd addends. Thus, the sum
$\sum_{v \in V(G)} \deg v$ must have an even number of odd addends. In other
words, the number of vertices v of G whose degree deg v is odd is even. □


Corollary 2.4.4 is often stated as follows: In a group of people, the number of
persons with an odd number of friends (in the group) is even. It is also known
as the handshake lemma.
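Both Proposition 2.4.3 and Corollary 2.4.4 are easy to check on a concrete graph. The Python sketch below uses a hypothetical graph on the vertices 1,...,5 that I chose so that its degrees are 3, 2, 3, 2, 0, as in the degree example earlier in this section; it need not be the graph drawn there. The helper name `degree` is likewise an assumption of this sketch.

```python
def degree(V, E, v):
    """deg v: the number of edges e in E that contain v."""
    return sum(1 for e in E if v in e)


# A hypothetical graph with degrees 3, 2, 3, 2, 0.
V = {1, 2, 3, 4, 5}
E = {frozenset(p) for p in [(1, 2), (1, 3), (1, 4), (2, 3), (3, 4)]}
degrees = {v: degree(V, E, v) for v in V}

# Proposition 2.4.3: the degrees sum to twice the number of edges.
assert sum(degrees.values()) == 2 * len(E)

# Corollary 2.4.4: the number of vertices of odd degree is even.
odd = sorted(v for v in V if degrees[v] % 2 == 1)
assert len(odd) % 2 == 0
```

Here the vertices of odd degree are 1 and 3, so there are two of them.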

Here is another property of degrees in a simple graph: 


Proposition 2.4.5. Let G be a simple graph with at least two vertices. Then, 
there exist two distinct vertices v and w of G that have the same degree. 


Proof. Assume the contrary. So the degrees of all n vertices of G are distinct,
where n = |V (G)|.
In other words, the map


deg : V (G) → {0,1,...,n − 1},
v ↦ deg v


is injective. But this is a map between two finite sets of the same size (n). When
such a map is injective, it has to be bijective (by the pigeonhole principle).
Therefore, in particular, it takes both 0 and n − 1 as values.

In other words, there are a vertex u with degree 0 and a vertex v with degree
n − 1. Are these two vertices adjacent or not? Yes because of deg v = n − 1; no
because of deg u = 0. Contradiction!

(Fine print: The two vertices u and v must be distinct, since 0 ≠ n − 1. It is
here that we are using the “at least two vertices” assumption!) □
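Proposition 2.4.5 only asserts existence, but finding such a pair of vertices is straightforward. A possible Python sketch (the function name `same_degree_pair` and the path graph used as an example are assumptions made here) records the first vertex seen with each degree and stops at the first repetition:

```python
def degree(V, E, v):
    """deg v: the number of edges e in E that contain v."""
    return sum(1 for e in E if v in e)


def same_degree_pair(V, E):
    """Return two distinct vertices of equal degree, or None if there are none."""
    first_with_degree = {}
    for v in sorted(V):
        d = degree(V, E, v)
        if d in first_with_degree:
            return first_with_degree[d], v
        first_with_degree[d] = v
    return None


# A path 1 - 2 - 3 - 4: the degrees are 1, 2, 2, 1.
V = {1, 2, 3, 4}
E = {frozenset({1, 2}), frozenset({2, 3}), frozenset({3, 4})}
assert same_degree_pair(V, E) == (2, 3)
```

By the proposition, the function can return `None` only for graphs with fewer than two vertices.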


Here is an application of counting neighbors to proving a fact about graphs. 
This is known as Mantel’s theorem: 


Theorem 2.4.6 (Mantel’s theorem). Let G be a simple graph with n vertices 
and e edges. Assume that e > n²/4. Then, G has a triangle (i.e., three distinct
vertices that are pairwise adjacent). 
Example 2.4.7. Let G be the graph (V,E), where


V = {1,2,3,4,5,6}; 
E = {12, 23, 34, 45, 56, 61, 14, 25, 36}. 


Here is a drawing: 


This graph has no triangle (which, by the way, is easy to verify without 
checking all possibilities: just observe that every edge of G joins two vertices 
of different parity, but a triangle would necessarily have two vertices of equal 
parity). Thus, by the contrapositive of Mantel’s theorem, it satisfies e ≤ n²/4
with n = 6 and e = 9. This is indeed true because 9 = 6²/4. But this also
entails that if we add any further edge to G, then we obtain a triangle. 
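Both claims of this example (no triangle now, but a triangle after adding any further edge) can be confirmed mechanically. Here is a Python sketch under the same set-of-frozensets encoding as before; the helper `triangles` is a name introduced for this illustration.

```python
from itertools import combinations

V = set(range(1, 7))
E = {frozenset(p) for p in
     [(1, 2), (2, 3), (3, 4), (4, 5), (5, 6), (6, 1), (1, 4), (2, 5), (3, 6)]}


def triangles(V, E):
    """All 3-element subsets of V whose vertices are pairwise adjacent."""
    return [t for t in combinations(sorted(V), 3)
            if all(frozenset(p) in E for p in combinations(t, 2))]


# The graph has n = 6 vertices, e = 9 = n^2/4 edges, and no triangle.
assert len(E) == 9 == len(V) ** 2 // 4
assert triangles(V, E) == []

# Adding any one of the 6 missing edges creates a triangle.
missing = [frozenset(p) for p in combinations(sorted(V), 2) if frozenset(p) not in E]
assert len(missing) == 6
assert all(triangles(V, E | {f}) for f in missing)
```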


Proof of Mantel’s theorem. We will prove the theorem by strong induction on n. 
Thus, we assume (as the induction hypothesis) that the theorem holds for all 
graphs with fewer than n vertices. We must now prove it for our graph G with 
n vertices. Let V = V(G) and E = E(G), so that G = (V,E). 

We must prove that G has a triangle. Assume the contrary. Thus, G has no 
triangle. 

From e > n²/4 ≥ 0, we see that G has an edge. Pick any such edge, and call
it vw. Thus, v ≠ w.

Let us now color each edge of G with one of three colors, as follows:


• The edge vw is colored black.

• Each edge that contains exactly one of v and w is colored red.

• All other edges are colored blue.
The following picture shows an example of this coloring:


We now count the edges of each color:


• There is exactly 1 black edge, namely vw.


• How many red edges can there be? I claim that there are at most n − 2.
Indeed, each vertex other than v and w is connected to at most one of v
and w by a red edge, since otherwise it would form a triangle with v and
w.


• How many blue edges can there be? The vertices other than v and w,
along with the blue edges that join them, form a graph with n − 2 vertices;
this graph has no triangles (since G has no triangles). By the induction
hypothesis, however, if this graph had more than (n − 2)²/4 edges, then
it would have a triangle. Thus, it has ≤ (n − 2)²/4 edges. In other words,
there are ≤ (n − 2)²/4 blue edges.


In total, the number of edges is therefore


≤ 1 + (n − 2) + (n − 2)²/4 = n²/4.


In other words, e ≤ n²/4. This contradicts e > n²/4. This is the contradiction
we were looking for, so the induction is complete. □
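The equality 1 + (n − 2) + (n − 2)²/4 = n²/4 used in the final count is a one-line computation, but it is worth spelling out once, since the whole induction hinges on it:

```latex
\begin{align*}
1 + (n-2) + \frac{(n-2)^2}{4}
&= (n-1) + \frac{n^2 - 4n + 4}{4} \\
&= (n-1) + \frac{n^2}{4} - n + 1 \\
&= \frac{n^2}{4}.
\end{align*}
```

The three summands correspond to the black, red and blue edges, in this order.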


Quick question: What about equality? Can a graph with n vertices and 
exactly n?/4 edges have no triangles? Yes (for even n). Indeed, for any even n, 
we can take the graph 


({1,2,...,n}, {ij | i# jmod2}) 


An introduction to graph theory, version August 2, 2023 page 23 


(keep in mind that ij means the 2-element set $\{i, j\}$ here, not the product $i \cdot j$).
We can also do this for odd n, and obtain a graph with $(n^2 - 1)/4$ edges (which
is as close to $n^2/4$ as we can get when n is odd; after all, the number of edges
has to be an integer). So the bound in Mantel’s theorem is optimal (as far as
integers are concerned).
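
The extremal example can be checked by machine for small n. The following is a minimal sketch, assuming we represent a simple graph as a pair (vertices, edges) with each edge a 2-element frozenset; the names `parity_graph` and `has_triangle` are ours, not notation from these notes.

```python
from itertools import combinations

def parity_graph(n):
    """Graph on {1,...,n} whose edges join two numbers of different parity."""
    vertices = set(range(1, n + 1))
    edges = {frozenset((i, j)) for i, j in combinations(vertices, 2)
             if (i - j) % 2 != 0}
    return vertices, edges

def has_triangle(vertices, edges):
    """Brute-force search for three mutually adjacent vertices."""
    return any(
        frozenset((a, b)) in edges
        and frozenset((b, c)) in edges
        and frozenset((a, c)) in edges
        for a, b, c in combinations(vertices, 3)
    )

for n in range(1, 9):
    V, E = parity_graph(n)
    # n*n // 4 equals n^2/4 for even n and (n^2 - 1)/4 for odd n.
    assert len(E) == n * n // 4
    assert not has_triangle(V, E)
```

This only confirms the edge count and triangle-freeness for n up to 8; it is a sanity check, not a proof.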


The following exercise can be regarded as a “mirror version” of Mantel’s 
theorem: 


Exercise 2.2. Let G be a simple graph with n vertices and e edges. Assume 
that e < n(n—2)/4. Prove that G has an anti-triangle (i.e., three distinct 
vertices that are pairwise non-adjacent). 


[Solution: This is Exercise 2 on homework set #1 from my Spring 2017 
course; see the course page for solutions.] 
Mantel’s theorem can be generalized: 


Theorem 2.4.8 (Turán’s theorem). Let r be a positive integer. Let G be a
simple graph with n vertices and e edges. Assume that

$$e > \frac{r - 1}{r} \cdot \frac{n^2}{2}.$$

Then, there exist $r + 1$ distinct vertices of G that are mutually adjacent.


Mantel’s theorem is the particular case for $r = 2$. We will see a proof of
Turán’s theorem later (Theorem 7.3.1). Mantel’s and Turán’s theorems are two
of the simplest results of extremal graph theory: the study of how inequalities
between some graph parameters (in our case: the numbers of vertices and
edges) imply the existence of certain substructures (in our case: of a triangle or
of $r + 1$ mutually adjacent vertices). Deeper introductions to this subject can be
found in [Zhao23, Chapters 1 and 5] and [Jukna11].


Exercise 2.3. Let G = (V, E) be a simple graph. Set $n = |V|$. Prove that we
can find some edges $e_1, e_2, \ldots, e_k$ of G and some triangles $t_1, t_2, \ldots, t_\ell$ of G
such that $k + \ell \leq n^2/4$ and such that each edge $e \in E \setminus \{e_1, e_2, \ldots, e_k\}$ is a
subset of (at least) one of the triangles $t_1, t_2, \ldots, t_\ell$.


[Remark: In other words, this exercise is claiming that all edges of G can be
covered by at most $n^2/4$ edge-or-triangles. Here, an edge-or-triangle means
either an edge or a triangle of G, and the word “covers” means that each
edge of G is a subset of at least one of the chosen edge-or-triangles.]


[Hint: Imitate the above proof of Mantel’s theorem.]

Remark 2.4.9. Exercise 2.3 is a generalization of Mantel’s theorem. Indeed, if
the simple graph G = (V, E) has no triangles, then the number $\ell$ in Exercise
2.3 must be 0, and thus the edges $e_1, e_2, \ldots, e_k$ must be all edges of G, so that
we conclude that $|E| = k \leq k + \ell \leq n^2/4$.


Exercise 2.4. Let G be a simple graph with n vertices and k edges, where
$n > 0$. Prove that G has at least $\frac{k}{3n} \left(4k - n^2\right)$ triangles.


[Hint: First argue that for any edge vw of G, the total number of triangles
that contain v and w is at least $\deg v + \deg w - n$. Then, use the inequality
$n \left(a_1^2 + a_2^2 + \cdots + a_n^2\right) \geq \left(a_1 + a_2 + \cdots + a_n\right)^2$, which holds for any n real
numbers $a_1, a_2, \ldots, a_n$. (This is a particular case of the Cauchy–Schwarz
inequality or the Chebyshev inequality or the Jensen inequality; pick your
favorite!)]


Remark 2.4.10. Exercise 2.4 is known as the Moon–Moser inequality for
triangles. It, too, generalizes Mantel’s theorem: If $k > n^2/4$, then
$\frac{k}{3n} \left(4k - n^2\right) > 0$, and therefore Exercise 2.4 entails that G has at least one
triangle.


Exercise 2.5. Let G = (V, E) be a simple graph. 

An edge e = {u,v} of G will be called odd if the number deg u + deg v is 
odd. 

Prove that the number of odd edges of G is even. 


[Hint: There are several solutions. One uses modular arithmetic and (in
particular) the congruence $m^2 \equiv m \bmod 2$ for every integer m. Other solutions
use nothing but common sense.]


Exercise 2.6. Let G = (V, E) be a simple graph. Let S be a subset of V, and
let $k = |S|$. Prove that

$$\sum_{v \in S} \deg v \leq k (k - 1) + \sum_{v \in V \setminus S} \min \{\deg v, k\}.$$


Remark 2.4.11. Exercise 2.6 has a converse (the so-called Erdős–Gallai
theorem): If $d_1, d_2, \ldots, d_n$ are n nonnegative integers such that $d_1 + d_2 + \cdots + d_n$
is even and such that $d_1 \geq d_2 \geq \cdots \geq d_n$ and such that each $k \in \{1, 2, \ldots, n\}$
satisfies

$$\sum_{i=1}^{k} d_i \leq k (k - 1) + \sum_{i=k+1}^{n} \min \{d_i, k\},$$

then there exists a simple graph with vertex set $\{1, 2, \ldots, n\}$ whose vertices
have degrees $d_1, d_2, \ldots, d_n$.


2.5. Graph isomorphism


Two graphs can be distinct and yet “the same up to the names of their vertices”.

[Figure: an example of two such graphs.]

Let us formalize this:

Definition 2.5.1. Let G and H be two simple graphs.

(a) A graph isomorphism (or isomorphism) from G to H means a bijection
$\varphi : V(G) \to V(H)$ that “preserves edges”, i.e., that has the following
property: For any two vertices u and v of G, we have

$$\left(uv \in E(G)\right) \iff \left(\varphi(u) \varphi(v) \in E(H)\right).$$

(b) We say that G and H are isomorphic (this is written $G \cong H$) if there
exists a graph isomorphism from G to H.
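
The definition translates directly into a mechanical test. Below is a minimal sketch, assuming a graph is stored as a pair (vertices, edges) with edges given as 2-element frozensets and a map as a Python dict; the helper names `is_isomorphism` and `edge_set`, and the two small graphs, are our own illustrations.

```python
def is_isomorphism(phi, G, H):
    """Check whether the dict phi is a graph isomorphism from G to H."""
    (VG, EG), (VH, EH) = G, H
    # phi must be a bijection from V(G) to V(H).
    if set(phi) != set(VG) or set(phi.values()) != set(VH) or len(VG) != len(VH):
        return False
    # For all u != v:  uv is an edge of G  <=>  phi(u)phi(v) is an edge of H.
    return all(
        (frozenset((u, v)) in EG) == (frozenset((phi[u], phi[v])) in EH)
        for u in VG for v in VG if u != v
    )

def edge_set(*pairs):
    return {frozenset(p) for p in pairs}

G = ({1, 2, 3}, edge_set((1, 2), (2, 3)))   # hypothetical example: path 1 - 2 - 3
H = ({1, 2, 3}, edge_set((1, 3), (3, 2)))   # hypothetical example: path 1 - 3 - 2

assert is_isomorphism({1: 1, 2: 3, 3: 2}, G, H)
assert not is_isomorphism({1: 1, 2: 2, 3: 3}, G, H)
```

Pairs with u = v are skipped, since uv is then not a 2-element set and both sides of the equivalence are false.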
Here are two examples: 


• The two graphs

[Figure: two graphs, each with vertex set {1, 2, 3}.]

are isomorphic, because the bijection between their vertex sets that sends
1, 2, 3 to 1, 3, 2 is an isomorphism. Another isomorphism between the
same two graphs sends 1, 2, 3 to 2, 3, 1.

• The two graphs

[Figure: a graph with vertices 1, 2, ..., 6 and a graph with vertices 1, 2, 3, A, B, C.]

are isomorphic, because the bijection between their vertex sets that sends
1, 2, 3, 4, 5, 6 to 1, B, 3, A, 2, C is an isomorphism.

Here are some basic properties of isomorphisms (the proofs are straightforward):


Proposition 2.5.2. Let G and H be two graphs. The inverse of a graph
isomorphism $\varphi$ from G to H is a graph isomorphism from H to G.


Proposition 2.5.3. Let G, H and I be three graphs. If $\varphi$ is a graph
isomorphism from G to H, and $\psi$ is a graph isomorphism from H to I, then $\psi \circ \varphi$ is
a graph isomorphism from G to I.


As a consequence of these two propositions, it is easy to see that the relation
$\cong$ (on the class of all graphs) is an equivalence relation.

Graph isomorphisms preserve all “intrinsic” properties of a graph. For ex- 
ample: 


Proposition 2.5.4. Let G and H be two simple graphs, and $\varphi$ a graph
isomorphism from G to H. Then:

(a) For every $v \in V(G)$, we have $\deg_G v = \deg_H (\varphi(v))$. Here, $\deg_G v$
means the degree of v as a vertex of G, whereas $\deg_H (\varphi(v))$ means the
degree of $\varphi(v)$ as a vertex of H.

(b) We have $|E(H)| = |E(G)|$.

(c) We have $|V(H)| = |V(G)|$.

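
These invariants give cheap necessary conditions for isomorphism, which can be combined with an exhaustive search. The sketch below is one naive way to do this (trying all bijections, so only sensible for very small graphs); the function names and the two example graphs are our own, and vertices are assumed to be sortable (e.g. integers).

```python
from itertools import permutations

def degree_sequence(V, E):
    """Sorted list of vertex degrees of the simple graph (V, E)."""
    return sorted(sum(1 for e in E if v in e) for v in V)

def find_isomorphism(G, H):
    """Return some isomorphism G -> H as a dict, or None if there is none."""
    (VG, EG), (VH, EH) = G, H
    # Necessary conditions from Proposition 2.5.4.
    if len(VG) != len(VH) or len(EG) != len(EH):
        return None
    if degree_sequence(VG, EG) != degree_sequence(VH, EH):
        return None
    vg = sorted(VG)
    for image in permutations(sorted(VH)):          # all |V|! bijections
        phi = dict(zip(vg, image))
        # With equal edge counts and phi injective, it suffices that
        # every edge of G is sent to an edge of H.
        if all(frozenset((phi[u], phi[v])) in EH for u, v in map(tuple, EG)):
            return phi
    return None

def edges(*pairs):
    return {frozenset(p) for p in pairs}

V = {1, 2, 3, 4, 5, 6}
hexagon = (V, edges((1, 2), (2, 3), (3, 4), (4, 5), (5, 6), (6, 1)))
two_triangles = (V, edges((1, 2), (2, 3), (3, 1), (4, 5), (5, 6), (6, 4)))

# Same vertex count, edge count and degrees, yet not isomorphic.
assert degree_sequence(*hexagon) == degree_sequence(*two_triangles)
assert find_isomorphism(hexagon, two_triangles) is None
```

The example shows that the conditions of Proposition 2.5.4 are necessary but not sufficient: the two graphs agree in all three invariants, and the search still finds no isomorphism.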
One use of graph isomorphisms is to relabel the vertices of a graph. For 


example, we can relabel the vertices of an n-vertex graph as 1,2,...,n, or as 
any other n distinct objects: 


Proposition 2.5.5. Let G be a simple graph. Let S be a finite set such that 
|S| = |V(G)|. Then, there exists a simple graph H that is isomorphic to G 
and has vertex set V (H) = S. 


Proof. Straightforward. □


2.6. Some families of graphs 


We will now define some particularly significant families of graphs. 


2.6.1. Complete and empty graphs 
The simplest families of graphs are the complete graphs and the empty graphs:

Definition 2.6.1. Let V be a finite set.

(a) The complete graph on V means the simple graph $(V, \mathcal{P}_2(V))$. It is
the simple graph with vertex set V in which every two distinct vertices
are adjacent.

If $V = \{1, 2, \ldots, n\}$ for some $n \in \mathbb{N}$, then the complete graph on V is
denoted $K_n$.

(b) The empty graph on V means the simple graph $(V, \varnothing)$. It is the simple
graph with vertex set V and no edges.

The following pictures show the complete graph and the empty graph on the
set $\{1, 2, 3, 4, 5\}$:

[Figure: the complete graph and the empty graph on {1, 2, 3, 4, 5}.]

The complete one is called $K_5$.

Here are the complete graphs $K_0, K_1, K_2, K_3, K_4$:

[Figure: the complete graphs K_0, K_1, K_2, K_3, K_4.]

Note that a simple graph G is isomorphic to the complete graph $K_n$ if and
only if it has n vertices and is a complete graph (i.e., every two distinct vertices
are adjacent).

Question: Given two finite sets V and W, what are the isomorphisms from
the complete graph on V to the complete graph on W?

Answer: If $|V| \neq |W|$, then there are none. If $|V| = |W|$, then any bijection
from V to W is an isomorphism. The same holds for empty graphs.


2.6.2. Path and cycle graphs 

Next come two families of graphs with fairly simple shapes: 
Definition 2.6.2. For each $n \in \mathbb{N}$, we define the n-th path graph $P_n$ to be the
simple graph

$$\left( \{1, 2, \ldots, n\}, \ \{\{i, i+1\} \mid 1 \leq i < n\} \right)
= \left( \{1, 2, \ldots, n\}, \ \{12, 23, 34, \ldots, (n-1)n\} \right).$$

This graph has n vertices and $n - 1$ edges (unless $n = 0$, in which case it has
0 edges).


Definition 2.6.3. For each $n > 1$, we define the n-th cycle graph $C_n$ to be the
simple graph

$$\left( \{1, 2, \ldots, n\}, \ \{\{i, i+1\} \mid 1 \leq i < n\} \cup \{\{n, 1\}\} \right)
= \left( \{1, 2, \ldots, n\}, \ \{12, 23, 34, \ldots, (n-1)n, n1\} \right).$$

This graph has n vertices and n edges (unless $n = 2$, in which case it has 1
edge only). (We will later modify the definition of the 2-nd cycle graph $C_2$
somewhat, in order to force it to have 2 edges. But we cannot do this yet,
since a simple graph with 2 vertices cannot have 2 edges.)
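
The three families defined so far are easy to build explicitly. Here is a minimal sketch, again assuming the representation of a graph as a pair (vertices, edges) with frozenset edges; the function names are ours.

```python
from itertools import combinations

def complete_graph(n):
    """K_n: every two distinct vertices of {1,...,n} are adjacent."""
    V = set(range(1, n + 1))
    return V, {frozenset(p) for p in combinations(V, 2)}

def path_graph(n):
    """P_n: edges {i, i+1} for 1 <= i < n."""
    V = set(range(1, n + 1))
    return V, {frozenset((i, i + 1)) for i in range(1, n)}

def cycle_graph(n):
    """C_n for n > 1: the path graph P_n plus the edge {n, 1}."""
    V, E = path_graph(n)
    return V, E | {frozenset((n, 1))}

for n in range(2, 8):
    assert len(complete_graph(n)[1]) == n * (n - 1) // 2
    assert len(path_graph(n)[1]) == n - 1
    # For n = 2 the extra edge {2, 1} coincides with {1, 2}.
    assert len(cycle_graph(n)[1]) == (1 if n == 2 else n)

assert cycle_graph(3) == complete_graph(3)   # C_3 and K_3 are the same graph
```

The n = 2 case shows concretely why the simple-graph version of the 2-nd cycle graph has only one edge: a set cannot contain the same 2-element subset twice.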


The following pictures show the path graph $P_5$ and the cycle graph $C_5$:

[Figure: the path graph P_5 and the cycle graph C_5.]

Of course, it is more common to draw the path graph stretched out horizontally:

[Figure: the path graph drawn as a horizontal line of vertices.]

Note that the cycle graph $C_3$ is identical with the complete graph $K_3$.

Question: What are the graph isomorphisms from $P_n$ to itself?

Answer: One such isomorphism is the identity map $\operatorname{id} : \{1, 2, \ldots, n\} \to \{1, 2, \ldots, n\}$.
Another is the “reversal” map

$$\{1, 2, \ldots, n\} \to \{1, 2, \ldots, n\}, \qquad i \mapsto n + 1 - i.$$

There are no others.


Question: What are the graph isomorphisms from $C_n$ to itself?

Answer: For any $k \in \mathbb{Z}$, we can define a “rotation by k vertices”, which is the
map

$$\{1, 2, \ldots, n\} \to \{1, 2, \ldots, n\}, \qquad
i \mapsto \left(i + k \text{ reduced modulo } n \text{ to an element of } \{1, 2, \ldots, n\}\right).$$

Thus we get n rotations (one for each $k \in \{1, 2, \ldots, n\}$); all of them are graph
isomorphisms.

There are also the reflections, which are the maps

$$\{1, 2, \ldots, n\} \to \{1, 2, \ldots, n\}, \qquad
i \mapsto \left(k - i \text{ reduced modulo } n \text{ to an element of } \{1, 2, \ldots, n\}\right)$$

for $k \in \mathbb{Z}$. There are n of them, too, and they are isomorphisms as well.
Altogether we obtain 2n isomorphisms (for $n > 2$), and there are no others.
(The group they form is the n-th dihedral group.)
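
Both answers can be confirmed by exhaustive search for small n. The following sketch enumerates all bijections of the vertex set and keeps those that map edges to edges; this is a brute-force check under our own naming (`automorphisms`, `path_graph`, `cycle_graph`), not an efficient method.

```python
from itertools import permutations

def path_graph(n):
    V = set(range(1, n + 1))
    return V, {frozenset((i, i + 1)) for i in range(1, n)}

def cycle_graph(n):
    V, E = path_graph(n)
    return V, E | {frozenset((n, 1))}

def automorphisms(V, E):
    """All isomorphisms from the simple graph (V, E) to itself."""
    vs = sorted(V)
    result = []
    for image in permutations(vs):
        phi = dict(zip(vs, image))
        # A bijection of V that sends edges to edges permutes the (finite)
        # edge set, hence also sends non-edges to non-edges.
        if all(frozenset(phi[x] for x in e) in E for e in E):
            result.append(phi)
    return result

for n in range(3, 7):
    assert len(automorphisms(*cycle_graph(n))) == 2 * n   # rotations + reflections
    assert len(automorphisms(*path_graph(n))) == 2        # identity + reversal
```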

2.6.3. Kneser graphs 


Here is a more exotic family of graphs: 


Example 2.6.4. If S is a finite set, and if $k \in \mathbb{N}$, then we define the k-th Kneser
graph of S to be the simple graph

$$K_{S,k} := \left( \mathcal{P}_k(S), \ \{IJ \mid I, J \in \mathcal{P}_k(S) \text{ and } I \cap J = \varnothing\} \right).$$

The vertices of $K_{S,k}$ are the k-element subsets of S, and two such subsets are
adjacent if they are disjoint.

The graph $K_{\{1,2,\ldots,5\},2}$ is called the Petersen graph; here is what it looks like:

[Figure: the Petersen graph.]


2.7. Subgraphs 
Definition 2.7.1. Let G = (V, E) be a simple graph.

(a) A subgraph of G means a simple graph of the form H = (W, F), where
$W \subseteq V$ and $F \subseteq E$. In other words, a subgraph of G means a simple
graph whose vertices are vertices of G and whose edges are edges of G.

(b) Let S be a subset of V. The induced subgraph of G on the set S denotes
the subgraph

$$\left( S, \ E \cap \mathcal{P}_2(S) \right)$$

of G. In other words, it denotes the subgraph of G whose vertices
are the elements of S, and whose edges are precisely those edges of G
both of whose endpoints belong to S.

(c) An induced subgraph of G means a subgraph of G that is the induced
subgraph of G on S for some $S \subseteq V$.

Thus, a subgraph of a graph G is obtained by throwing away some vertices and
some edges of G (in such a way, of course, that no edges remain “dangling”;
i.e., if you throw away a vertex, then you must throw away all edges that
contain this vertex). Such a subgraph is an induced subgraph if no edges are
removed without need, i.e., if you removed only those edges that lost some of
their endpoints. Thus, induced subgraphs can be characterized as follows:


Proposition 2.7.2. Let H be a subgraph of a simple graph G. Then, H is an 
induced subgraph of G if and only if each edge uv of G whose endpoints u 
and v belong to V (H) is an edge of H. 


Proof. This is a matter of understanding the definition. □


Example 2.7.3. Let n > 1 be an integer. 


(a) The path graph $P_n$ is a subgraph of the cycle graph $C_n$. It is not an
induced subgraph (for $n > 2$), because it contains the two vertices n
and 1 of $C_n$ but does not contain the edge n1.

(b) The path graph $P_{n-1}$ is an induced subgraph of $P_n$. (Namely, it is the
induced subgraph of $P_n$ on the set $\{1, 2, \ldots, n - 1\}$.)

(c) Assume that $n > 3$. Is $C_{n-1}$ a subgraph of $C_n$? No, because the edge
$(n - 1) 1$ belongs to $C_{n-1}$ but not to $C_n$.
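
The three claims of this example can be tested for a concrete n. The sketch below assumes the (vertices, edges) representation with frozenset edges; `is_subgraph`, `induced_subgraph` and `is_induced_subgraph` are our own helper names.

```python
def path_graph(n):
    V = set(range(1, n + 1))
    return V, {frozenset((i, i + 1)) for i in range(1, n)}

def cycle_graph(n):
    V, E = path_graph(n)
    return V, E | {frozenset((n, 1))}

def is_subgraph(H, G):
    """H = (W, F) is a subgraph of G = (V, E) if W ⊆ V and F ⊆ E."""
    return H[0] <= G[0] and H[1] <= G[1]

def induced_subgraph(G, S):
    """Keep exactly those edges of G that have both endpoints in S."""
    V, E = G
    S = set(S)
    return S, {e for e in E if e <= S}

def is_induced_subgraph(H, G):
    return is_subgraph(H, G) and H == induced_subgraph(G, H[0])

n = 5
assert is_subgraph(path_graph(n), cycle_graph(n))                # part (a)
assert not is_induced_subgraph(path_graph(n), cycle_graph(n))    # edge n1 is missing
assert is_induced_subgraph(path_graph(n - 1), path_graph(n))     # part (b)
assert not is_subgraph(cycle_graph(n - 1), cycle_graph(n))       # part (c)
```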


The following is easy: 


Proposition 2.7.4. Let G be a simple graph, and let H be a subgraph of G. 
Assume that H is a complete graph. Then, H is automatically an induced 
subgraph of G. 


Proof. This follows from Proposition 2.7.2, since the completeness of H means
that each 2-element subset $\{u, v\}$ of the vertex set of H is an edge of H. □


We note that triangles in a graph can be characterized in terms of complete 
subgraphs. Namely, a triangle “is” the same as a complete subgraph (or, equiv- 
alently, induced complete subgraph) with three vertices: 


Remark 2.7.5. Let G be a simple graph. Let u,v, w be three distinct vertices 
of G. The following are equivalent: 


1. The set $\{u, v, w\}$ is a triangle of G.

2. The induced subgraph of G on $\{u, v, w\}$ is isomorphic to $K_3$.

3. The induced subgraph of G on $\{u, v, w\}$ is isomorphic to $C_3$.

Thus, instead of saying “triangle of G”, one often says “a $K_3$ in G” or “a $C_3$ in
G”. Generally, “an H in G” (where H and G are two graphs) means a subgraph
of G that is isomorphic to H. (In the case when $H = K_3 = C_3$, it does not
matter whether we require it to be a subgraph or an induced subgraph, since a
complete subgraph has to be induced automatically.)


Exercise 2.7. Let n be a positive integer. Let S be a simple graph with 2n 
vertices. Prove that S has two distinct vertices that have an even number of 
common neighbors. 


Exercise 2.8. Let n > 2 be an integer. Let G be a simple graph with n vertices. 


(a) Describe G if the degrees of the vertices of G are 1,1,...,1,n — 1. 


(b) Let a and b be two positive integers such that a+ b = n. Describe G if 
the degrees of the vertices of G are 1,1,...,1,a,b. 


Here, to “describe” G means to explicitly determine (with proof) a graph 
that is isomorphic to G. 


Remark 2.7.6. The situations in Exercise 2.8 are, in a sense, exceptional.
Typically, the degrees of the vertices of a graph do not uniquely determine the
graph up to isomorphism. For example, the two graphs

[Figure: two graphs in which every vertex has degree 3.]

are not isomorphic⁵, but have the same degrees (namely, each vertex of either
graph has degree 3).


⁵ The easiest way to see this is to observe that the second graph has a triangle (i.e., three
distinct vertices that are mutually adjacent), while the first graph does not.


2.8. Disjoint unions


Another way of constructing new graphs from old is the disjoint union. The
idea is simple: Taking the disjoint union $G_1 \sqcup G_2 \sqcup \cdots \sqcup G_k$ of several simple
graphs $G_1, G_2, \ldots, G_k$ means putting the graphs alongside each other and
treating the result as one big graph. To make this formally watertight, we have to
relabel each vertex v of each graph $G_i$ as the pair $(i, v)$, so that vertices coming
from different graphs appear as different even if they were equal. For example,
the disjoint union $C_3 \sqcup C_4$ of the two cycle graphs $C_3$ and $C_4$ should not be

[Figure: C_3 and C_4 drawn side by side, each with its original vertex labels.]

(which makes no sense, because there are two points labelled 1 in this picture,
but a graph can have only one vertex 1), but rather should be

[Figure: C_3 and C_4 drawn side by side, with each vertex v of the i-th graph relabelled as (i, v).]

So here is the formal definition: 


Definition 2.8.1. Let $G_1, G_2, \ldots, G_k$ be simple graphs, where $G_i = (V_i, E_i)$ for
each $i \in \{1, 2, \ldots, k\}$. The disjoint union of these k graphs $G_1, G_2, \ldots, G_k$ is
defined to be the simple graph (V, E), where

$$V = \{(i, v) \mid i \in \{1, 2, \ldots, k\} \text{ and } v \in V_i\} \qquad \text{and}$$
$$E = \{\{(i, v_1), (i, v_2)\} \mid i \in \{1, 2, \ldots, k\} \text{ and } \{v_1, v_2\} \in E_i\}.$$

This disjoint union is denoted by $G_1 \sqcup G_2 \sqcup \cdots \sqcup G_k$.
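
The relabelling in this definition is straightforward to carry out by machine. Here is a minimal sketch, assuming graphs are pairs (vertices, edges) with frozenset edges; `disjoint_union` and `cycle_graph` are our own names.

```python
def disjoint_union(*graphs):
    """Vertex v of the i-th graph (i = 1, 2, ...) becomes the pair (i, v)."""
    V = {(i, v) for i, (Vi, _) in enumerate(graphs, start=1) for v in Vi}
    E = {frozenset((i, v) for v in e)
         for i, (_, Ei) in enumerate(graphs, start=1) for e in Ei}
    return V, E

def cycle_graph(n):
    """C_n for n > 1, with edges {i, i+1} and {n, 1}."""
    V = set(range(1, n + 1))
    return V, {frozenset((i, i % n + 1)) for i in V}

C3, C4 = cycle_graph(3), cycle_graph(4)
V34, E34 = disjoint_union(C3, C4)
V43, E43 = disjoint_union(C4, C3)

assert len(V34) == 7 and len(E34) == 7
# (2, 4) is "vertex 4 of the second graph": present only if the second graph is C_4.
assert (2, 4) in V34
assert (2, 4) not in V43
```

The last two assertions illustrate that swapping the order of the graphs changes the vertex set, even though the resulting graphs have the same shape.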


Note: If G and H are two graphs, then the two graphs $G \sqcup H$ and $H \sqcup G$ are
isomorphic, but not the same graph (unless G = H). For example, $C_3 \sqcup C_4$ has
a vertex (2, 4), but $C_4 \sqcup C_3$ does not.


2.9. Walks and paths


We now come to the definitions of walks and paths — two of the most funda- 
mental features that graphs can have. In particular, Euler’s 1736 paper, where 
graphs were first studied, is about certain kinds of walks. 


2.9.1. Definitions 


Imagine a graph as a road network, where each vertex is a town and each edge 
is a (bidirectional) road. By successively walking along several edges, you can 
often get from a town to another even if they are not adjacent. This is made 
formal in the concept of a “walk”: 


Definition 2.9.1. Let G be a simple graph. Then: 


(a) A walk (in G) means a finite sequence $(v_0, v_1, \ldots, v_k)$ of vertices of G
(with $k \geq 0$) such that all of $v_0 v_1, v_1 v_2, v_2 v_3, \ldots, v_{k-1} v_k$ are edges of
G. (The latter condition is vacuously true if $k = 0$.)

(b) If $w = (v_0, v_1, \ldots, v_k)$ is a walk in G, then:

• The vertices of w are defined to be $v_0, v_1, \ldots, v_k$.

• The edges of w are defined to be $v_0 v_1, v_1 v_2, v_2 v_3, \ldots, v_{k-1} v_k$.

• The nonnegative integer k is called the length of w. (This is the
number of all edges of w, counted with multiplicity. It is 1 smaller
than the number of all vertices of w, counted with multiplicity.)

• The vertex $v_0$ is called the starting point of w. We say that w starts
(or begins) at $v_0$.

• The vertex $v_k$ is called the ending point of w. We say that w ends
at $v_k$.

(c) A path (in G) means a walk (in G) whose vertices are distinct. In other
words, a path means a walk $(v_0, v_1, \ldots, v_k)$ such that $v_0, v_1, \ldots, v_k$ are
distinct.

(d) Let p and q be two vertices of G. A walk from p to q means a walk that
starts at p and ends at q. A path from p to q means a path that starts at
p and ends at q.


(e) We often say “walk of G” and “path of G” instead of “walk in G” and 
“path in G”, respectively. 


Example 2.9.2. Let G be the graph 


$$\left( \{1, 2, 3, 4, 5, 6\}, \ \{12, 23, 34, 45, 56, 61, 13\} \right).$$

This graph looks as follows:

[Figure: drawing of the graph G.]


Then: 


The sequence (1,3,4,5,6,1,3,2) of vertices of G is a walk in G. This 
walk is a walk from 1 to 2. It is not a path. The length of this walk is 7. 


The sequence (1,2,4,3) of vertices of G is not a walk, since 24 is not an 
edge of G. Hence, it is not a path either. 


The sequence (1,3,2,1) is a walk from 1 to 1. It has length 3. It is not a 
path. 


The sequence (1,2,1) is a walk from 1 to 1. It has length 2. It is not a 
path. 


The sequence (5) is a walk from 5 to 5. It has length 0. It is a path. 
More generally, each vertex v of G produces a length-0 path (v). 


The sequence (5,4) is a walk from 5 to 4. It has length 1. It is a path. 
More generally, each edge uv of G produces a length-1 path (u, v). 
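
The claims of this example can be verified mechanically. Below is a minimal sketch, assuming the edge set is stored as a set of 2-element frozensets and that the entries of each sequence are vertices of G; the function names `is_walk` and `is_path` are ours.

```python
def is_walk(seq, E):
    """True if every two consecutive entries of the nonempty tuple seq form an edge."""
    return len(seq) >= 1 and all(
        frozenset((a, b)) in E for a, b in zip(seq, seq[1:])
    )

def is_path(seq, E):
    """A path is a walk whose vertices are distinct."""
    return is_walk(seq, E) and len(set(seq)) == len(seq)

# The graph G of Example 2.9.2.
E = {frozenset(p) for p in [(1, 2), (2, 3), (3, 4), (4, 5), (5, 6), (6, 1), (1, 3)]}

expected = {                       # sequence: (is it a walk?, is it a path?)
    (1, 3, 4, 5, 6, 1, 3, 2): (True, False),
    (1, 2, 4, 3): (False, False),
    (1, 3, 2, 1): (True, False),
    (1, 2, 1): (True, False),
    (5,): (True, True),
    (5, 4): (True, True),
}
for seq, answer in expected.items():
    assert (is_walk(seq, E), is_path(seq, E)) == answer

assert len((1, 3, 4, 5, 6, 1, 3, 2)) - 1 == 7   # length = number of vertices minus 1
```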


Intuitively, we can think of walks and paths as follows: 


• A walk of a graph is a way of walking from one vertex to another (or to
the same vertex) by following a sequence of edges.

• A path is a walk whose vertices are distinct (i.e., each vertex appears at
most once in the walk).


Exercise 2.9. Let G be a simple graph. Let w be a path in G. Prove that the 
edges of w are distinct. (This may look obvious when you can point to a 
picture; but we ask you to give a rigorous proof!) 


[Solution: This is Exercise 3 on homework set #1 from my Spring 2017
course; see the course page for solutions.]


2.9.2. Composing/concatenating and reversing walks


Here are some simple things we can do with walks and paths. 
First, we can “splice” two walks together if the ending point of the first is the 
starting point of the second: 


Proposition 2.9.3. Let G be a simple graph. Let u, v and w be three vertices
of G. Let $a = (a_0, a_1, \ldots, a_k)$ be a walk from u to v. Let $b = (b_0, b_1, \ldots, b_\ell)$ be
a walk from v to w. Then,

$$(a_0, a_1, \ldots, a_k, b_1, b_2, \ldots, b_\ell) = (a_0, a_1, \ldots, a_{k-1}, b_0, b_1, \ldots, b_\ell)
= (a_0, a_1, \ldots, a_{k-1}, v, b_1, b_2, \ldots, b_\ell)$$

is a walk from u to w. This walk shall be denoted $a * b$.

Proof. Intuitively clear and straightforward to verify. □


Proposition 2.9.4. Let G be a simple graph. Let u and v be two vertices of G.
Let $a = (a_0, a_1, \ldots, a_k)$ be a walk from u to v. Then:

(a) The list $(a_k, a_{k-1}, \ldots, a_0)$ is a walk from v to u. We denote this walk by
rev a and call it the reversal of a.

(b) If a is a path, then rev a is a path again.

Proof. Intuitively clear and straightforward to verify. □
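
If walks are stored as tuples of vertices, both operations are one-liners. The sketch below uses our own function names `concat` (for a * b) and `rev`; the sample walks live in the graph of Example 2.9.2.

```python
def concat(a, b):
    """The walk a * b; requires that a ends where b starts."""
    if a[-1] != b[0]:
        raise ValueError("a must end at the starting point of b")
    return a + b[1:]          # the shared vertex is listed only once

def rev(a):
    """The reversal of the walk a."""
    return a[::-1]

a = (1, 3, 4)                 # a walk from 1 to 4
b = (4, 5, 6, 1)              # a walk from 4 to 1

assert concat(a, b) == (1, 3, 4, 5, 6, 1)
assert rev(concat(a, b)) == (1, 6, 5, 4, 3, 1)
# Lengths add up under concatenation: 2 + 3 = 5.
assert len(concat(a, b)) - 1 == (len(a) - 1) + (len(b) - 1)
```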


2.9.3. Reducing walks to paths 


A path is just a walk without repeated vertices. If you have a walk, you can 
turn it into a path by removing “loops” (or “digressions”): 


Proposition 2.9.5. Let G be a simple graph. Let u and v be two vertices of G.
Let $a = (a_0, a_1, \ldots, a_k)$ be a walk from u to v. Assume that a is not a path.
Then, there exists a walk from u to v whose length is smaller than k.

Proof. Since a is not a path, two of its vertices are equal. In other words, there
exist $i < j$ such that $a_i = a_j$. Consider these i and j. Now, consider the tuple

$$\Big( \underbrace{a_0, a_1, \ldots, a_i}_{\text{the first } i+1 \text{ vertices of } a}, \
\underbrace{a_{j+1}, a_{j+2}, \ldots, a_k}_{\text{the last } k-j \text{ vertices of } a} \Big)$$

(this is just a with the part between $a_i$ and $a_j$ cut out). This tuple is a walk from u
to v (if $j < k$, then $a_i a_{j+1} = a_j a_{j+1}$ is an edge of G), and its length is
$i + (k - j) < j + (k - j) = k$ (since $i < j$). So we have found a walk
from u to v whose length is smaller than k. This proves the proposition. □

Example 2.9.6. Consider the walk (1, 3, 4, 5, 6, 1, 3, 2) from Example 2.9.2.
Then, Proposition 2.9.5 tells us that there is a walk from 1 to 2 that has
smaller length. You can find this walk by removing the part between the two
3’s. You get the walk (1, 3, 2). This is actually a path.


Corollary 2.9.7 (When there is a walk, there is a path). Let G be a simple
graph. Let u and v be two vertices of G. Assume that there is a walk from u
to v of length k for some $k \in \mathbb{N}$. Then, there is a path from u to v of length
$\leq k$.


Proof. Proposition 2.9.5 says that if there is a walk from u to v that is not a path,
then there is a walk from u to v having shorter length. Apply this repeatedly,
until you get a path. (You will eventually get a path, because the length cannot
keep decreasing forever.) □
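
The proof of the corollary is effectively an algorithm. Here is a minimal sketch of it, with walks stored as tuples; the names `shortcut` and `walk_to_path` are ours, and the choice of which repeated vertex to cut at (the first repetition encountered) is one arbitrary option among several.

```python
def shortcut(walk):
    """One application of Proposition 2.9.5: cut out the part between
    two equal vertices a_i = a_j (i < j), keeping a_0..a_i and a_{j+1}..a_k."""
    first_seen = {}
    for j, vertex in enumerate(walk):
        if vertex in first_seen:
            i = first_seen[vertex]
            return walk[: i + 1] + walk[j + 1 :]
        first_seen[vertex] = j
    return walk               # no repeated vertex: already a path

def walk_to_path(walk):
    """Shorten repeatedly until all vertices are distinct (Corollary 2.9.7)."""
    while len(set(walk)) < len(walk):
        walk = shortcut(walk)     # the length strictly decreases each time
    return walk

assert walk_to_path((1, 3, 4, 5, 6, 1, 3, 2)) == (1, 3, 2)
```

On the walk from Example 2.9.6, this sketch happens to cut between the two 1's rather than the two 3's, but it arrives at the same path (1, 3, 2).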


2.9.4. Remark on algorithms 


We take a little break from proving structural theorems in order to address 
some important computational questions. As always in these notes, we will 
only scratch the surface and content ourselves with simple but not quite optimal 
algorithms. 

Given a simple graph G and two vertices u and v of G, we can ask ourselves 
the following questions: 


Question 1: Does G have a walk from u to v ? 
Question 2: Does G have a path from u to v ? 


Question 3: Find a shortest path from u to v (that is, a path from u 
to v having the smallest possible length), or determine that no such 
path exists. 


Question 4: Given a number $k \in \mathbb{N}$, find a walk from u to v having
length k, or determine that no such walk exists.


Question 5: Given a number $k \in \mathbb{N}$, find a path from u to v having
length k, or determine that no such path exists.


Corollary 2.9.7 reveals that Questions 1 and 2 are equivalent (indeed, the
existence of a walk from u to v entails the existence of a path from u to v
by Corollary 2.9.7, whereas the converse is obvious). Question 3 is clearly a
stronger version of Question 2 (in the sense that any answer to Question 3 will
automatically answer Question 2 as well).


With a bit more thought, it is easily seen that Question 4 is a stronger version
of Question 3. Indeed, Corollary 2.9.7 shows that a shortest walk from u to v (if
it exists) must also be a shortest path from u to v. However, any path from u to
v must have length $\leq n - 1$, where n is the number of vertices of G (since a path
of length k has $k + 1$ distinct vertices, but G has only n vertices to spare). Hence,
if there is no walk of length $\leq n - 1$ from u to v, then there is no path from u to
v whatsoever. Thus, if we answer Question 4 for all values $k \in \{0, 1, \ldots, n - 1\}$,
then we obtain either a shortest path from u to v (by taking the smallest k for
which the answer is positive, and then picking the resulting walk, which must
be a shortest path by what we previously said), or proof positive that no path
from u to v exists (if the answer for each $k \in \{0, 1, \ldots, n - 1\}$ is negative).
Thus, answering Question 4 will yield answers to Questions 1, 2 and 3.


Let us now outline how Question 4 can be answered using a recursive
algorithm. Specifically, we recurse on k. The base case ($k = 0$) is easy: A walk
from u to v having length 0 exists if u = v and does not exist otherwise. The
interesting part is the recursion step: Assume that the integer k is positive, and
that we already know how to answer Question 4 for $k - 1$ instead of k. Now,
let us answer it for k. To do so, we observe that any walk from u to v having
length k must have the form $(u, \ldots, w, v)$, where the penultimate vertex w is
some neighbor of v. Moreover, if we remove the last vertex v from our walk
$(u, \ldots, w, v)$, then we obtain a walk $(u, \ldots, w)$ of length $k - 1$. Hence, we can
find a walk from u to v having length k as follows:


• We make a list of all neighbors of v. We go through this list in some
arbitrary order.


• For each neighbor w in this list, we try to find a walk from u to w having
length $k - 1$ (this is a matter of answering Question 4 for $k - 1$ instead of
k, so we supposedly already know how to do this). If such a walk exists,
then we simply insert v at its end, and thus obtain a walk from u to v
having length k. Thus we obtain a positive answer to our question.


• If we have gone through our whole list of neighbors of v without finding
a walk from u to v having length k, then no such walk exists, and thus we
have found a negative answer.
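
The three steps above can be sketched in code as follows. This is one possible implementation, not the only one: the caching via `lru_cache` is our addition (it ensures that each pair (v, k) is solved only once), the function names are ours, and vertices are assumed to be sortable. The second function applies the recursion for k = 0, 1, ..., n − 1 to obtain a shortest path, as discussed before the algorithm.

```python
from functools import lru_cache

def make_walk_finder(V, E, u):
    """Return find_walk(v, k): a walk from u to v of length k, or None."""
    neighbors = {x: sorted(y for y in V if frozenset((x, y)) in E) for x in V}

    @lru_cache(maxsize=None)              # remember answers already computed
    def find_walk(v, k):
        if k == 0:                        # base case: only the trivial walk (u)
            return (u,) if u == v else None
        for w in neighbors[v]:            # candidate penultimate vertices
            shorter = find_walk(w, k - 1) # walk from u to w of length k - 1
            if shorter is not None:
                return shorter + (v,)     # insert v at its end
        return None                       # no neighbor worked

    return find_walk

def shortest_path(V, E, u, v):
    """Answer Question 4 for k = 0, 1, ..., n - 1; the first hit is a shortest path."""
    find_walk = make_walk_finder(V, E, u)
    for k in range(len(V)):
        walk = find_walk(v, k)
        if walk is not None:
            return walk
    return None

# The graph of Example 2.9.2.
V = {1, 2, 3, 4, 5, 6}
E = {frozenset(p) for p in [(1, 2), (2, 3), (3, 4), (4, 5), (5, 6), (6, 1), (1, 3)]}
assert shortest_path(V, E, 1, 4) == (1, 3, 4)
```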


This recursive algorithm answers Question 4, and is fast enough to be
practically viable if implemented well. (In the language of complexity theory, it is
a polynomial-time algorithm.⁶) Much more efficient algorithms exist, however.
In applications, a generalized version of Question 3 often appears, asking for
a path that is shortest not in the sense of smallest length, but in the sense of
smallest “weighted length” (i.e., different edges contribute differently to this
“length”). This generalized question is one of the most fundamental
algorithmic problems in computer science, known as the shortest path problem,
and various algorithms can be found on its Wikipedia page and in
algorithm-focussed texts such as [Griffi21, §3.5], [KelTro17, §12.3], [Schrij17, Chapter 1] or
(for a royal treatment) [Schrij03, Chapters 6–8].


⁶ To be specific: Its running time can be bounded by a polynomial in n and k, where n is the
number of vertices of G.

Question 5 looks superficially similar to Question 4, yet it differs in the most
important way: There is no efficient algorithm known for answering it! In the
language of complexity theory, it is an NP-hard problem, which means that
a polynomial-time algorithm for it is not expected to exist (although this is
the kind of negative that appears near-impossible to prove at the current stage
of the discipline). It is still technically a finite problem (there are only finitely
many possible paths in G, and thus one can theoretically try them all), and there
is even a polynomial-time algorithm for any fixed value of k (again, a trivial one:
check all the $n^{k+1}$ possible $(k + 1)$-tuples of vertices of G for whether they are
paths from u to v), but the complexity of this algorithm grows exponentially in
k, which makes it useless in practice.
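
For illustration only, here is what the trivial algorithm for Question 5 looks like. The function name `find_path_of_length` is ours, and vertices are assumed to be sortable; the loop really does visit every tuple of k + 1 vertices, which is why it is hopeless beyond tiny inputs.

```python
from itertools import product

def find_path_of_length(V, E, u, v, k):
    """Question 5 by brute force: test all tuples of k + 1 vertices of G."""
    for seq in product(sorted(V), repeat=k + 1):
        if seq[0] != u or seq[-1] != v:
            continue
        if len(set(seq)) != len(seq):                  # vertices must be distinct
            continue
        if all(frozenset(p) in E for p in zip(seq, seq[1:])):
            return seq
    return None

# The graph of Example 2.9.2.
V = {1, 2, 3, 4, 5, 6}
E = {frozenset(p) for p in [(1, 2), (2, 3), (3, 4), (4, 5), (5, 6), (6, 1), (1, 3)]}

assert find_path_of_length(V, E, 1, 2, 5) == (1, 6, 5, 4, 3, 2)
assert find_path_of_length(V, E, 1, 2, 4) is None   # a walk of length 4 exists, a path does not
```

The second assertion shows how Questions 4 and 5 can have different answers for the same k: (1, 2, 1, 3, 2) is a walk of length 4 from 1 to 2, but no path of that length exists in this graph.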


2.9.5. The equivalence relation “path-connected”


We can use the concepts of walks and paths to define a certain equivalence 
relation on the vertex set V (G) of any graph G: 


Definition 2.9.8. Let G be a simple graph. We define a binary relation $\simeq_G$ on
the set V(G) as follows: For two vertices u and v of G, we shall have $u \simeq_G v$
if and only if there exists a walk from u to v in G.

This binary relation $\simeq_G$ is called “path-connectedness” or just
“connectedness”. When two vertices u and v satisfy $u \simeq_G v$, we say that
“u and v are path-connected”.


Proposition 2.9.9. Let G be a simple graph. Then, the relation $\simeq_G$ is an
equivalence relation.


Proof. We need to show that $\simeq_G$ is symmetric, reflexive and transitive.


• Symmetry: If $u \simeq_G v$, then $v \simeq_G u$, because we can take a walk from u to
v and reverse it.


• Reflexivity: We always have $u \simeq_G u$, since the trivial walk (u) is a walk
from u to u.


• Transitivity: If $u \simeq_G v$ and $v \simeq_G w$, then $u \simeq_G w$, because (as we know
from Proposition 2.9.3) we can take a walk a from u to v and a walk b from
v to w and combine them to form the walk $a * b$ defined in Proposition
2.9.3.

□


Proposition 2.9.10. Let G be a simple graph. Let u and v be two vertices of
G. Then, $u \simeq_G v$ if and only if there exists a path from u to v.


Proof. $\Longleftarrow$: Clear, since any path is a walk.
$\Longrightarrow$: This is just saying that if there is a walk from u to v, then there is a path
from u to v. But this follows from Corollary 2.9.7. □


2.9.6. Connected components and connectedness


The equivalence relation $\simeq_G$ introduced in Definition 2.9.8 allows us to define
two important concepts:


Definition 2.9.11. Let G be a simple graph. The equivalence classes of the
equivalence relation $\simeq_G$ are called the connected components (or, for short,
components) of G.

Definition 2.9.12. Let G be a simple graph. We say that G is connected if G 
has exactly one component. 


Thus, a simple graph G is connected if and only if it has at least one compo- 
nent (i.e., it has at least one vertex) and it has at most one component (i.e., each 
two of its vertices are path-connected). 


Example 2.9.13. Let G be the graph with vertex set $\{1, 2, \ldots, 9\}$ and such
that two vertices i and j are adjacent if and only if $|i - j| = 3$. What are the
components of G?

The graph G looks like this:

[Figure: drawing of G, with several edges crossing.]

This looks like a jumbled mess, so you might think that all vertices are
mutually path-connected. But this is not the case, because edges that cross in
a drawing do not necessarily have endpoints in common. Walks can only
move from one edge to another at a common endpoint. Thus, there are
far fewer walks than the picture might suggest. We have $1 \simeq_G 4 \simeq_G 7$
and $2 \simeq_G 5 \simeq_G 8$ and $3 \simeq_G 6 \simeq_G 9$, but there are no further $\simeq_G$-relations. In
fact, two vertices of G are adjacent only if they are congruent modulo 3 (as
numbers), and therefore you cannot move from one modulo-3 congruence
class to another by walking along edges of G. So the components of G are
{1, 4, 7} and {2, 5, 8} and {3, 6, 9}. The graph G is not connected.


Example 2.9.14. Let G be the graph with vertex set $\{1, 2, \ldots, 9\}$ and such that
two vertices i and j are adjacent if and only if $|i - j| = 6$. This graph looks
like this:

[Figure: drawing of G.]

What are the components of G? They are {1, 7} and {2, 8} and {3, 9} and
{4} and {5} and {6}. Note that three of these six components are singleton
sets. The graph G is not connected.


Example 2.9.15. Let G be the graph with vertex set {1,2,...,9} and such that 
two vertices i and j are adjacent if and only if |i — j| = 3 or |i—j| = 4. This 
graph looks like this: 


We can take a long walk through G: 
(1,4,7,3,6,9,5,2,5,8). 


This walk traverses every vertex of G; thus, any two vertices of G are path- 
connected. Hence, G has only one component, namely {1,2,...,9}. Thus, G 
is connected. 
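
The components in the last three examples can be recomputed by machine. The sketch below explores the graph from each not-yet-visited vertex (a standard depth-first exploration, which is our choice of technique and is not taken from these notes); `components` and `difference_graph` are hypothetical helper names.

```python
def components(V, E):
    """Connected components of (V, E), as a list of vertex sets."""
    neighbors = {x: {y for y in V if frozenset((x, y)) in E} for x in V}
    seen, result = set(), []
    for start in sorted(V):
        if start in seen:
            continue
        component, stack = {start}, [start]
        while stack:                              # depth-first exploration
            x = stack.pop()
            for y in neighbors[x] - component:
                component.add(y)
                stack.append(y)
        seen |= component
        result.append(component)
    return result

def difference_graph(n, differences):
    """Vertices 1..n; i and j are adjacent iff |i - j| lies in `differences`."""
    V = set(range(1, n + 1))
    E = {frozenset((i, j)) for i in V for j in V if i < j and j - i in differences}
    return V, E

assert components(*difference_graph(9, {3})) == [{1, 4, 7}, {2, 5, 8}, {3, 6, 9}]
assert components(*difference_graph(9, {6})) == [{1, 7}, {2, 8}, {3, 9}, {4}, {5}, {6}]
assert components(*difference_graph(9, {3, 4})) == [set(range(1, 10))]
```

A graph is connected precisely when the returned list has exactly one entry, which is the case only for the third graph.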


Example 2.9.16. The complete graph on a nonempty set is connected. The 
complete graph on the empty set is not connected, since it has 0 (not 1) 
components. 


Example 2.9.17. The empty graph on a finite set V has |V| many components
(those are the singleton sets {v} for v ∈ V). Thus, it is connected if and only
if |V| = 1.


Exercise 2.10. Let k ∈ ℕ. Let S be a finite set.

Recall that the Kneser graph K_{S,k} is the simple graph whose vertices are
the k-element subsets of S, and whose edges are the unordered pairs {A, B}
consisting of two such subsets A and B that satisfy A ∩ B = ∅.

Prove that this Kneser graph K_{S,k} is connected if |S| ≥ 2k + 1.


[Remark: Can the “if” here be replaced by an “if and only if”? Not quite,
because the graph K_{S,k} is also connected if |S| = 2 and k = 1 (in which case
it has two vertices and one edge), or if |S| = k (in which case it has only one
vertex), or if k = 0 (in which case it has only one vertex). But these are the
only “exceptions”.]
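
Small cases of this exercise and of the remark can be explored by computer before attempting a proof. The Python sketch below is an illustration only (the function names kneser_graph and is_connected are hypothetical, and S is taken to be {1,...,n}); it proves nothing, but it shows the pattern.

```python
from itertools import combinations

def kneser_graph(n, k):
    """Adjacency dict of the Kneser graph on the k-element subsets of
    {1,...,n}: two distinct subsets are adjacent iff they are disjoint."""
    verts = [frozenset(c) for c in combinations(range(1, n + 1), k)]
    return {A: {B for B in verts if B != A and not (A & B)} for A in verts}

def is_connected(adj):
    """True iff the graph has exactly one component; tested by a
    depth-first search from an arbitrary vertex."""
    if not adj:
        return False  # no vertices means 0 components, not 1
    start = next(iter(adj))
    seen, stack = {start}, [start]
    while stack:
        v = stack.pop()
        for w in adj[v]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return len(seen) == len(adj)

for k in range(1, 4):
    for n in range(2 * k + 1, 8):
        print(n, k, is_connected(kneser_graph(n, k)))
```

For comparison, is_connected(kneser_graph(4, 2)) is False, since each 2-element subset of {1,2,3,4} is disjoint only from its complement.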


2.9.7. Induced subgraphs on components 


The following is not hard to see: 


Proposition 2.9.18. Let G be a simple graph. Let C be a component of G. 
Then, the induced subgraph of G on the set C is connected. 


Proof. Let G [C] be this induced subgraph. We need to show that G [C] is con-
nected. In other words, we need to show that G [C] has exactly 1 component.

Clearly, G [C] has at least one vertex (since C is a component, i.e., an equiv-
alence class of ≃_G, but equivalence classes are always nonempty), thus has at
least 1 component. So we only need to show that G [C] has no more than 1
component. In other words, we need to show that any two vertices of G [C] are
path-connected in G [C].

So let u and v be two vertices of G [C]. Then, u, v ∈ C, and therefore u ≃_G
v (since C is a component of G). In other words, there exists a walk w =
(w_0, w_1, ..., w_k) from u to v in G. We shall now prove that this walk w is
actually a walk of G [C]. In other words, we shall prove that all vertices of w
belong to C.

But this is easy: If w_i is a vertex of w, then (w_0, w_1, ..., w_i) is a walk from
u to w_i in G, and therefore we have u ≃_G w_i, so that w_i belongs to the same
component of G as u; but that component is C. Thus, we have shown that
each vertex w_i of w belongs to C. Therefore, w is a walk of the graph G [C].
Consequently, it shows that u ≃_{G [C]} v.

We have now proved that u ≃_{G [C]} v for any two vertices u and v of G [C].
Hence, the relation ≃_{G [C]} has no more than 1 equivalence class. In other words,
the graph G [C] has no more than 1 component. This completes our proof. □


In the following proposition, we are using the notation G [S] for the induced
subgraph of a simple graph G on a subset S of its vertex set.


Proposition 2.9.19. Let G be a simple graph. Let C_1, C_2, ..., C_k be all compo-
nents of G (listed without repetition).
Then, G is isomorphic to the disjoint union G [C_1] ⊔ G [C_2] ⊔ ··· ⊔ G [C_k].


Proof. Consider the bijection from V (G [C_1] ⊔ G [C_2] ⊔ ··· ⊔ G [C_k]) to V (G) that
sends each vertex (i, v) of G [C_1] ⊔ G [C_2] ⊔ ··· ⊔ G [C_k] to the vertex v of G. We
claim that this bijection is a graph isomorphism. In order to prove this, we
need to check that there are no edges of G that join vertices in different com-
ponents. But this is easy: If two vertices in different components of G were
adjacent, then they would be path-connected, and thus would actually belong
to the same component. □


The upshot of these results is that every simple graph can be decomposed 
into a disjoint union of its components (or, more precisely, of the induced sub- 
graphs on its components). Each of these components is a connected graph. 
Moreover, this is easily seen to be the only way to decompose the graph into a 
disjoint union of connected graphs. 


2.9.8. Some exercises on connectedness 


Exercise 2.11. Let G be a simple graph with V (G) ≠ ∅. Show that the
following two statements are equivalent:


• Statement 1: The graph G is connected.


• Statement 2: For every two nonempty subsets A and B of V (G) satisfy-
ing A ∩ B = ∅ and A ∪ B = V (G), there exist a ∈ A and b ∈ B such
that ab ∈ E (G). (In other words: Whenever we subdivide the vertex set
V (G) of G into two nonempty subsets, there will be at least one edge
of G connecting a vertex in one subset to a vertex in the other.)


[Solution: This is Exercise 7 on homework set #1 from my Spring 2017
course; see the course page for solutions.]


Exercise 2.12. Let V be a nonempty finite set. Let G and H be two simple
graphs such that V (G) = V (H) = V. Assume that for each u ∈ V and
v ∈ V, there exists a path from u to v in G or a path from u to v in H. Prove
that at least one of the graphs G and H is connected.


[Solution: This is Exercise 8 on homework set #1 from my Spring 2017
course; see the course page for solutions.]


Exercise 2.13. Let G = (V, E) be a simple graph. The complement graph Ḡ
of G is defined to be the simple graph (V, P_2(V) \ E). (Thus, two distinct


An introduction to graph theory, version August 2, 2023 page 44 


vertices u and v in V are adjacent in the complement graph Ḡ if and only if
they are not adjacent in G.)
Prove that at least one of the following two statements holds:


• Statement 1: For each u ∈ V and v ∈ V, there exists a path from u to v
in G of length ≤ 3.


• Statement 2: For each u ∈ V and v ∈ V, there exists a path from u to v
in the complement graph Ḡ of length ≤ 2.


[Solution: This is Exercise 9 on homework set #1 from my Spring 2017
course; see the course page for solutions.]


Exercise 2.14. Let n > 2 be an integer. Let G be a connected simple graph 
with n vertices. 


(a) Describe G if the degrees of the vertices of G are 1,1,2,2,...,2 (exactly
two 1’s and n − 2 many 2’s).


(b) Describe G if the degrees of the vertices of G are 1,1,...,1,n − 1.


(c) Describe G if the degrees of the vertices of G are 2,2,...,2. 


Here, to “describe” G means to explicitly determine (with proof) a graph 
that is isomorphic to G. 


The following exercise is not explicitly concerned with connectedness and com- 
ponents, but it might help to think about components to solve it (although there 
are solutions that do not use them): 


Exercise 2.15. Let G be a simple graph with n vertices. Assume that each 
vertex of G has at least one neighbor. 

A matching of G shall mean a set F of edges of G such that no two edges 
in F have a vertex in common. Let m be the largest size of a matching of G. 

An edge cover of G shall mean a set F of edges of G such that each vertex
of G is contained in at least one edge e ∈ F. Let c be the smallest size of an
edge cover of G.

Prove that c + m = n.
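
For small graphs, the numbers m and c can be found by brute force, which gives a quick way to test the claim c + m = n on examples. The following Python sketch is only an illustration: the function names are made up, the search tries all subsets of the edge set (so it is hopeless for large graphs), and it assumes, as the exercise does, that each vertex has at least one neighbor. It is run here on the cycle graph C_5 with edges 12, 23, 34, 45, 51.

```python
from itertools import combinations

def is_matching(F):
    """No two edges of F have a vertex in common."""
    endpoints = [v for e in F for v in e]
    return len(endpoints) == len(set(endpoints))

def is_edge_cover(F, vertices):
    """Each vertex is contained in at least one edge of F."""
    return set(vertices) == {v for e in F for v in e}

def matching_and_cover_numbers(vertices, edges):
    """Return (m, c): the largest size of a matching and the smallest size
    of an edge cover, found by trying all subsets of the edge set."""
    families = [F for r in range(len(edges) + 1)
                for F in combinations(edges, r)]
    m = max(len(F) for F in families if is_matching(F))
    c = min(len(F) for F in families if is_edge_cover(F, vertices))
    return m, c

V = [1, 2, 3, 4, 5]
E = [(1, 2), (2, 3), (3, 4), (4, 5), (5, 1)]  # the cycle graph C_5
m, c = matching_and_cover_numbers(V, E)
print(m, c, c + m == len(V))
```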


Remark 2.9.20. Let G be the cycle graph C5 shown in Example 
Then, {12, 34} is a matching of G of largest possible size (why?), whereas
{12, 34, 45} is an edge cover of G of smallest possible size (why?). Thus,
Exercise 2.15 says that 2 + 3 = 5 here, which is indeed true.


2.10. Closed walks and cycles


Here are two further kinds of walks:


Definition 2.10.1. Let G be a simple graph.


(a) A closed walk of G means a walk whose first vertex is identical with
its last vertex. In other words, it means a walk (w_0, w_1, ..., w_k) with
w_0 = w_k. Sometimes, closed walks are also known as circuits (but
many authors use this latter word for something slightly different).


(b) A cycle of G means a closed walk (w_0, w_1, ..., w_k) such that k ≥ 3 and
such that the vertices w_0, w_1, ..., w_{k−1} are distinct.


Example 2.10.2. Let G be the simple graph 
({1,2,3,4,5,6}, {12, 23, 34, 45, 56, 61, 13}). 


This graph looks as follows (we have already seen it in Example 2.9.2):


Then: 


• The sequence (1,3,2,1,6,5,6,1) is a closed walk of G. But it is very
much not a cycle.


• The sequences (1,2,3,1) and (1,3,4,5,6,1) and (1,2,3,4,5,6,1) are cy-
cles of G. You can get further cycles by rotating these sequences (in a
proper sense of this word; e.g., rotating (1,2,3,1) gives (2,3,1,2) and
(3,1,2,3)) and by reversing them. Every cycle of G can be obtained in
this way.


• The sequences (1) and (1,2,1) are closed walks, but not cycles of G
(since they fail the k ≥ 3 condition).


• The sequence (1,2,3) is a walk, but not a closed walk, since 1 ≠ 3.


Authors have different opinions about whether (1,2,3,1) and (1,3,2,1) count
as different cycles. Fortunately, this matters only if you want to count cycles,
but not for the existence or non-existence of cycles.
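
The definitions of closed walks and cycles translate directly into code, which makes it easy to test sequences like the ones in Example 2.10.2. The sketch below is an illustration and not part of the notes: the function names are made up, walks are given as tuples of vertices, and each edge is stored as a frozenset of its two endpoints.

```python
# Edge set of the graph G from Example 2.10.2.
E = {frozenset(e) for e in
     [(1, 2), (2, 3), (3, 4), (4, 5), (5, 6), (6, 1), (1, 3)]}

def is_walk(edges, w):
    """Consecutive entries of the vertex sequence w must be adjacent."""
    return len(w) >= 1 and all(frozenset((a, b)) in edges
                               for a, b in zip(w, w[1:]))

def is_closed_walk(edges, w):
    """A walk whose first vertex equals its last vertex."""
    return is_walk(edges, w) and w[0] == w[-1]

def is_cycle(edges, w):
    """A closed walk (w_0, ..., w_k) with k >= 3 such that the vertices
    w_0, ..., w_{k-1} are distinct."""
    k = len(w) - 1
    return is_closed_walk(edges, w) and k >= 3 and len(set(w[:-1])) == k

for w in [(1, 3, 2, 1, 6, 5, 6, 1), (1, 2, 3, 1), (1, 3, 4, 5, 6, 1),
          (1, 2, 3, 4, 5, 6, 1), (1,), (1, 2, 1), (1, 2, 3)]:
    print(w, is_closed_walk(E, w), is_cycle(E, w))
```

Note that is_cycle treats (1,2,3,1) and (1,3,2,1) as two different cycles, since it works with sequences; this is one of the two conventions mentioned above.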

We have now defined paths (in an arbitrary graph) and also path graphs P_n;
we have also defined cycles (in an arbitrary graph) and also cycle graphs C_n.
Besides their similar names, are they related? The answer is “yes”:


Proposition 2.10.3. Let G be a simple graph. 


(a) If (p_0, p_1, ..., p_k) is a path of G, then there is a subgraph of
G isomorphic to the path graph P_{k+1}, namely the subgraph
({p_0, p_1, ..., p_k}, {p_i p_{i+1} | 0 ≤ i < k}). (If this subgraph is actually
an induced subgraph of G, then the path (p_0, p_1, ..., p_k) is called an
“induced path”.)

Conversely, any subgraph of G isomorphic to P_{k+1} gives a path of G.


(b) Now, assume that k ≥ 3. If (c_0, c_1, ..., c_k) is a cycle of G, then there is a
subgraph of G isomorphic to the cycle graph C_k, namely the subgraph
({c_0, c_1, ..., c_k}, {c_i c_{i+1} | 0 ≤ i < k}). (If this subgraph is actually an
induced subgraph of G, then the cycle (c_0, c_1, ..., c_k) is called an “in-
duced cycle”.)

Conversely, any subgraph of G isomorphic to C_k gives a cycle of G.


Proof. Straightforward. □


Certain graphs contain cycles; other graphs don’t. For instance, the complete
graph K_n contains a lot of cycles (when n ≥ 3), whereas the path graph P_n
contains none. Let us try to find some criteria for when a graph can and when
it cannot have cycles.⁷


Proposition 2.10.4. Let G be a simple graph. Let w be a walk of G such
that no two adjacent edges of w are identical. (By “adjacent edges”, we
mean edges of the form w_{i−1} w_i and w_i w_{i+1}, where w_{i−1}, w_i, w_{i+1} are three
consecutive vertices of w.)

Then, w either is a path or contains a cycle (i.e., there exists a cycle of G 
whose edges are edges of w). 


Example 2.10.5. Let G be as in Example 2.10.2. Then, (2,1,3,2,1,6) is a walk
w of G such that no two adjacent edges of w are identical (even though the 
edge 21 appears twice in this walk). On the other hand, (2,1,3,1,6) is not 
such a walk (since its two adjacent edges 13 and 31 are identical). 


⁷ Mantel’s theorem already gives such a criterion for cycles of length 3 (because a cycle of
length 3 is the same as a triangle).


Proof of Proposition 2.10.4. We assume that w is not a path. We must then show
that w contains a cycle.

Write w as w = (w_0, w_1, ..., w_k). Since w is not a path, two of the vertices
w_0, w_1, ..., w_k must be equal. In other words, there exists a pair (i, j) of integers
i and j with i < j and w_i = w_j. Among all such pairs, we pick one with
minimum difference j − i. We shall show that the walk (w_i, w_{i+1}, ..., w_j) is a
cycle.

First, this walk is clearly a closed walk (since w_i = w_j). It thus remains to
show that j − i ≥ 3 and that the vertices w_i, w_{i+1}, ..., w_{j−1} are distinct. The
distinctness of w_i, w_{i+1}, ..., w_{j−1} follows from the minimality of j − i. To show
that j − i ≥ 3, we assume the contrary. Thus, j − i is either 1 or 2 (since i < j).
But j − i cannot be 1, since the endpoints of an edge cannot be equal (since our
graph is a simple graph). So j − i must be 2. Thus, w_i = w_{i+2}. Therefore, the
two edges w_i w_{i+1} and w_{i+1} w_{i+2} are identical. But this contradicts the fact that
no two adjacent edges of w are identical. Contradiction, qed. □
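
The proof just given is constructive: it tells us where in w to find the cycle. A minimal Python sketch of this recipe follows (the function name cycle_in_walk is hypothetical, and a walk is given as a list of vertices). It searches for a pair i < j with w_i = w_j and minimum difference j − i, and returns the part of the walk between these two positions.

```python
def cycle_in_walk(w):
    """Return the closed subwalk (w_i, ..., w_j), where (i, j) is a pair
    with i < j and w_i == w_j that minimizes j - i. Returns None if no
    such pair exists, i.e., if w is a path."""
    best = None
    for i in range(len(w)):
        for j in range(i + 1, len(w)):
            if w[i] == w[j] and (best is None or j - i < best[1] - best[0]):
                best = (i, j)
    if best is None:
        return None
    i, j = best
    return w[i:j + 1]

print(cycle_in_walk([2, 1, 3, 2, 1, 6]))  # backtrack-free walk from Example 2.10.5
print(cycle_in_walk([2, 1, 3, 1, 6]))     # not backtrack-free
```

For the first walk, the result (2,1,3,2) is a cycle. For the second walk, the result is (1,3,1), which is a closed walk but not a cycle; this shows why the assumption that no two adjacent edges of w are identical cannot be dropped.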


Corollary 2.10.6. Let G be a simple graph. Assume that G has a closed walk 
w of length > 0 such that no two adjacent edges of w are identical. Then, G 
has a cycle. 


Proof. This follows from Proposition 2.10.4, since w is not a path. □


Theorem 2.10.7. Let G be a simple graph. Let u and v be two vertices in G. 
Assume that there are two distinct paths from u to v. Then, G has a cycle. 


Proof. More generally, we shall prove this theorem with the word “path” re- 
placed by “backtrack-free walk”, where a “backtrack-free walk” means a walk 
w such that no two adjacent edges of w are identical. This is a generalization 
of the theorem, since every path is a backtrack-free walk (why?). 

So we claim the following: 


Claim 1: Let p and q be two distinct backtrack-free walks that start 
at the same vertex and end at the same vertex. Then, G has a cycle. 


We shall prove Claim 1 by induction on the length of p. So we fix an integer
N, and we assume that Claim 1 is proved in the case when the length of p is
N − 1. We must now show that it is also true when the length of p is N.

So let p = (p_0, p_1, ..., p_a) and q = (q_0, q_1, ..., q_b) be two distinct backtrack-
free walks that start at the same vertex and end at the same vertex and satisfy
a = N. We must find a cycle.

The walks p and q are distinct but start at the same vertex, so they cannot
both be trivial⁸. If one of them is trivial, then the other is a closed walk (because
a trivial walk is a closed walk), and then our goal follows from Corollary 2.10.6
in this case (because we have a nontrivial closed backtrack-free walk). Hence,
from now on, we WLOG assume that neither of the two walks p and q is
trivial. Thus, each of these two walks has a last edge. The last edge of p is
p_{a−1} p_a, whereas the last edge of q is q_{b−1} q_b.


⁸ We say that a walk is trivial if it has length 0.

Two cases are possible: 

Case 1: We have p_{a−1} p_a = q_{b−1} q_b.

Case 2: We have p_{a−1} p_a ≠ q_{b−1} q_b.

Let us consider Case 1 first. In this case, the last edges p_{a−1} p_a and q_{b−1} q_b of
the two walks p and q are identical, so the second-to-last vertices of these two
walks must also be identical (since the last vertices p_a and q_b are identical).
Thus, if we remove these last edges from both walks, then we obtain two shorter
backtrack-free walks (p_0, p_1, ..., p_{a−1}) and (q_0, q_1, ..., q_{b−1}) that again start
at the same vertex and end at the same vertex, and that are still distinct (since
p and q are distinct but have the same last vertex). The length of the first of
them is a − 1 = N − 1. Hence, by the induction hypothesis, we can apply Claim 1
to these two shorter walks (instead of p and q), and we conclude that G has a
cycle. So we are done in Case 1.

Let us now consider Case 2. In this case, we combine the two walks p and q 
(more precisely, p and the reversal of q) to obtain the closed walk 


(p_0, p_1, ..., p_{a−1}, p_a = q_b, q_{b−1}, ..., q_0).


This closed walk is backtrack-free (since (p_0, p_1, ..., p_a) and (q_0, q_1, ..., q_b) are
backtrack-free, and since p_{a−1} p_a ≠ q_{b−1} q_b) and has length > 0 (since it contains
at least the edge p_{a−1} p_a). Hence, Corollary 2.10.6 entails that G has a cycle.

We have thus found a cycle in both Cases 1 and 2. This completes the induc-
tion step. Thus, we have proved Claim 1. As we said, Theorem 2.10.7 follows
from it. □


Exercise 2.16. Let G be a simple graph. 


(a) Prove that if G has a closed walk of odd length, then G has a cycle of 
odd length. 


(b) Is it true that if G has a closed walk of length not divisible by 3, then G 
has a cycle of length not divisible by 3 ? 


(c) Does the answer to part (b) change if we replace “walk” by “non-
backtracking walk”? (A walk w with edges e_1, e_2, ..., e_k (in this or-
der) is said to be non-backtracking if each i ∈ {1, 2, ..., k − 1} satisfies
e_i ≠ e_{i+1}.)


(d) A trail (in a graph) means a walk whose edges are distinct (but whose 
vertices are not necessarily distinct). Does the answer to part (b) change 
if we replace “walk” by “trail”? 


(Proofs and counterexamples should be given.)


2.11. The longest path trick


Here is another proposition that guarantees the existence of cycles in a graph 
under certain circumstances. More importantly, its proof illustrates a useful 
tactic in dealing with graphs: 


Proposition 2.11.1. Let G be a simple graph with at least one vertex. Let
d > 1 be an integer. Assume that each vertex of G has degree ≥ d. Then, G
has a cycle of length ≥ d + 1.


Proof. Let p = (v_0, v_1, ..., v_m) be a longest path of G. (Why does G have a
longest path? Let’s see: Any path of G has length ≤ |V| − 1, since its vertices
have to be distinct. Moreover, G has at least one vertex and thus has at least
one path. A finite nonempty set of integers has a largest element. Thus, G has
a longest path.)

The vertex v_0 has degree ≥ d (by assumption), and thus has ≥ d neighbors
(since the degree of a vertex is the number of its neighbors).

If all neighbors of v_0 belonged to the set {v_1, v_2, ..., v_{d−1}}⁹, then the
number of neighbors of v_0 would be at most d − 1, which would contradict the
previous sentence. Thus, there exists at least one neighbor u of v_0 that does
not belong to this set {v_1, v_2, ..., v_{d−1}}. Consider this u. Then, u ≠ v_0 (since a
vertex cannot be its own neighbor).

Attaching the vertex u to the front of the path p, we obtain a walk

p' := (u, v_0, v_1, ..., v_m).

If we had u ∉ {v_0, v_1, ..., v_m}, then this walk p' would be a path; but this
would contradict the fact that p is a longest path of G. Thus, we must have
u ∈ {v_0, v_1, ..., v_m}. In other words, u = v_i for some i ∈ {0, 1, ..., m}. Consider
this i. Since u ≠ v_0 and u ∉ {v_1, v_2, ..., v_{d−1}}, we thus have i ≥ d. Here is a
picture:




Now, consider the walk

c := (u, v_0, v_1, ..., v_i).


This is a closed walk (since u = v_i) and has length i + 1 ≥ d + 1 (since i ≥ d). If
we can show that c is a cycle, then we have thus found a cycle of length ≥ d + 1,
so we will be done.


⁹ If d − 1 > m, then this set should be understood to mean {v_1, v_2, ..., v_m}.


It thus remains to prove that c is a cycle. Let us do this. We need to check
that the vertices u, v_0, v_1, ..., v_{i−1} are distinct, and that the length of c is ≥ 3.
The latter claim is clear: The length of c is i + 1 ≥ d + 1 ≥ 3 (since d > 1
and d ∈ ℤ). The former claim is not much harder: Since u = v_i, the vertices
u, v_0, v_1, ..., v_{i−1} are just the vertices v_i, v_0, v_1, ..., v_{i−1}, and thus are distinct
because they are distinct vertices of the path p. The proof of Proposition 2.11.1
is thus complete. □
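
The argument above can be turned into a small program. The Python sketch below is an illustration under stated assumptions, not an efficient algorithm: the function names are made up, the graph is given as an adjacency dict, and the longest path is found by exhaustive search (exponential time, so only usable for small graphs). Since p is a longest path, every neighbor of v_0 lies on p; the sketch takes u to be the neighbor of v_0 that lies farthest along p, which is one admissible choice of the vertex u from the proof.

```python
def longest_path(adj):
    """Exhaustive search for a longest path (as a list of vertices)."""
    best = []

    def extend(path, on_path):
        nonlocal best
        if len(path) > len(best):
            best = list(path)
        for w in adj[path[-1]]:
            if w not in on_path:
                path.append(w)
                on_path.add(w)
                extend(path, on_path)
                path.pop()
                on_path.remove(w)

    for v in adj:
        extend([v], {v})
    return best

def long_cycle(adj):
    """Cycle (u, v_0, v_1, ..., v_i) as in the proof, where u = v_i is the
    neighbor of v_0 with the largest index i on a longest path."""
    p = longest_path(adj)
    i = max(idx for idx, v in enumerate(p) if v in adj[p[0]])
    return [p[i]] + p[:i + 1]

# Example: the cube graph, in which every vertex has degree 3 (so d = 3).
cube = {v: {v ^ 1, v ^ 2, v ^ 4} for v in range(8)}
c = long_cycle(cube)
print(c, "has length", len(c) - 1)
```

In this example, the proposition guarantees that the resulting cycle has length at least d + 1 = 4.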


2.12. Bridges 


One question that will later prove crucial is: What happens to a graph if we 
remove a single edge from it? Let us first define a notation for this: 


Definition 2.12.1. Let G = (V,E) be a simple graph. Let e be an edge of G. 
Then, G \ e will mean the graph obtained from G by removing this edge e. 
In other words, 

G \ e := (V, E \ {e}).


Some authors write G — e for G \ e. 


Theorem 2.12.2. Let G be a simple graph. Let e be an edge of G. Then: 


(a) If e is an edge of some cycle of G, then the components of G \ e are 
precisely the components of G. (Keep in mind that the components are 
sets of vertices. It is these sets that we are talking about here, not the 
induced subgraphs on these sets.) 


(b) If e appears in no cycle of G (in other words, there exists no cycle of G 
such that e is an edge of this cycle), then the graph G \ e has one more 
component than G. 


Example 2.12.3. Let G be the graph shown in the following picture: 


(1) 


(where we have labeled the edges a and b for further reference). This graph
has 4 components. The edge a is an edge of a cycle of G, whereas the edge
b appears in no cycle of G. Thus, if we set e = a, then Theorem 2.12.2 (a)
shows that the components of G \ e are precisely the components of G. This
graph G \ e for e = a looks as follows:


and visibly has 4 components. On the other hand, if we set e = b, then
Theorem 2.12.2 (b) shows that the graph G \ e has one more component than
G. This graph G \ e for e = b looks as follows:


and visibly has 5 components.


Proof of Theorem 2.12.2. We will only sketch the proof. For details, see [21f6,
§6.7].

Let u and v be the endpoints of e, so that e = uv. Note that (u, v) is a path of
G, and thus we have u ≃_G v.


(a) Assume that e is an edge of some cycle of G. Then, if you remove e from
this cycle, then you still have a path from u to v left (as the remaining edges of
the cycle function as a detour), and this path is a path of G \ e. Thus, u ≃_{G \ e} v.

Now, we must show that the components of G \ e are precisely the compo-
nents of G. This will clearly follow if we can show that the relation ≃_{G \ e} is
precisely the relation ≃_G (because the components of a graph are the equiva-
lence classes of its ≃ relation). So let us prove the latter fact.

We must show that two vertices x and y of G satisfy x ≃_{G \ e} y if and only if
they satisfy x ≃_G y. The “only if” part is obvious (since a walk of G \ e is always
a walk of G). It thus remains to prove the “if” part. So we assume that x and y
are two vertices of G satisfying x ≃_G y, and we want to show that x ≃_{G \ e} y.

From x ≃_G y, we conclude that G has a path from x to y (by Proposition
2.9.10). If this path does not use¹⁰ the edge e, then it is a path from x to y in
G \ e, and thus we have x ≃_{G \ e} y, which is what we wanted to prove. So we
WLOG assume that this path does use the edge e. Thus, this path contains the
endpoints u and v of this edge e. We WLOG assume that u appears before v on
this path (otherwise, just swap u with v). Thus, this path looks as follows:

(x, ..., u, v, ..., y).

If we remove the edge e = uv, then this path breaks into two smaller paths

(x, ..., u) and (v, ..., y)

(since the edges of a path are distinct, so e appears only once in it). Both of
these two smaller paths are paths of G \ e. Thus, x ≃_{G \ e} u and v ≃_{G \ e} y.
Now, recalling that ≃_{G \ e} is an equivalence relation, we combine these results to
obtain

x ≃_{G \ e} u ≃_{G \ e} v ≃_{G \ e} y.

Hence, x ≃_{G \ e} y. This completes the proof of Theorem 2.12.2 (a).


(b) Assume that e appears in no cycle of G. We must prove that the graph G \
e has one more component than G. To do so, it suffices to show the following:


Claim 1: The component of G that contains u and v (this component
exists, since u ≃_G v) breaks into two components of G \ e when the
edge e is removed.


Claim 2: All other components of G remain components of G \ e.


Claim 2 is pretty clear: The components of G that don’t contain u and v do
not change at all when e is removed (since they contain neither endpoint of e).
Thus, they remain components of G \ e. (Formalizing this is a nice exercise in
formalization; see [21f6, §6.7].)


It remains to prove Claim 1. We introduce some notations:


• Let C be the component of G that contains u and v.

• Let A be the component of G \ e that contains u.

• Let B be the component of G \ e that contains v.


Then, we must show that A ∪ B = C and A ∩ B = ∅.

To see that A ∩ B = ∅, we need to show that u ≃_{G \ e} v does not hold (since A
and B are the equivalence classes of u and v with respect to the relation ≃_{G \ e}).
So let us do this. Assume the contrary. Thus, u ≃_{G \ e} v. Hence, there exists a
path from u to v in G \ e. Since e = uv, we can “close” this path by appending
the vertex u to its end; the result is a cycle of the graph G that contains the
edge e. But this contradicts our assumption that no cycle of G contains e. This
contradiction shows that our assumption was wrong. Thus, we conclude that
u ≃_{G \ e} v does not hold. Hence, as we said, A ∩ B = ∅.


¹⁰ We say that a walk w uses an edge f if f is an edge of w.

It remains to show that A ∪ B = C. Since A and B are clearly subsets of C
(because each walk of G \ e is a walk of G, and thus each component of G \ e
is a subset of a component of G), we have A ∪ B ⊆ C, and therefore we only
need to show that C ⊆ A ∪ B. In other words, we need to show that each c ∈ C
belongs to A ∪ B.

Let us show this. Let c ∈ C be a vertex. Then, c ≃_G u (since C is the
component of G containing u). Therefore, G has a path p from c to u. Consider
this path p. Two cases are possible:


• Case 1: This path p does not use the edge e. In this case, p is a path of
G \ e, and thus we obtain c ≃_{G \ e} u. In other words, c ∈ A (since A is the
component of G \ e containing u).


• Case 2: This path p does use the edge e. In this case, the edge e must be
the last edge of p (since the path p would otherwise contain the vertex u
twice¹¹, but a path cannot contain a vertex twice), and the last two vertices
of p must be v and u in this order. Thus, by removing the last vertex from
p, we obtain a path from c to v, and this latter path is a path of G \ e (since
it no longer contains u and therefore does not use e). This yields c ≃_{G \ e} v.
In other words, c ∈ B (since B is the component of G \ e containing v).


In either of these two cases, we have shown that c belongs to one of A and B.
In other words, c ∈ A ∪ B. This is precisely what we wanted to show. This
completes the proof of Theorem 2.12.2 (b). □


We introduce some fairly standard terminology:


Definition 2.12.4. Let e be an edge of a simple graph G.


(a) We say that e is a bridge (of G) if e appears in no cycle of G. 


(b) We say that e is a cut-edge (of G) if the graph G \ e has more compo- 
nents than G. 


Corollary 2.12.5. Let e be an edge of a simple graph G. Then, e is a bridge if 
and only if e is a cut-edge. 


Proof. Follows from Theorem 2.12.2. □
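
Corollary 2.12.5 can be observed on a small example by testing the two definitions with two unrelated procedures. The Python sketch below is only an illustration: the function names and the example graph (a triangle 1, 2, 3 with a pendant edge 34 and an isolated vertex 5) are made up. Whether e = uv lies on a cycle is decided by an exhaustive search for a path from u to v that avoids e (removing e from a cycle leaves such a path, and closing such a path with e gives a cycle), whereas components are counted with a union-find structure.

```python
def num_components(vertices, edges):
    """Count the components using a simple union-find structure."""
    parent = {v: v for v in vertices}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    for u, v in edges:
        parent[find(u)] = find(v)
    return len({find(v) for v in vertices})

def on_some_cycle(edges, e):
    """True iff some path from u to v avoids the edge e = (u, v).
    Exhaustive depth-first search over paths; small graphs only."""
    u, v = e
    rest = [f for f in edges if f != e]

    def search(x, visited):
        if x == v:
            return True
        for a, b in rest:
            if x in (a, b):
                y = b if x == a else a
                if y not in visited and search(y, visited | {y}):
                    return True
        return False

    return search(u, {u})

def is_cut_edge(vertices, edges, e):
    """True iff removing e increases the number of components."""
    rest = [f for f in edges if f != e]
    return num_components(vertices, rest) > num_components(vertices, edges)

V = [1, 2, 3, 4, 5]
E = [(1, 2), (2, 3), (1, 3), (3, 4)]
for e in E:
    print(e, "bridge:", not on_some_cycle(E, e),
          "cut-edge:", is_cut_edge(V, E, e))
```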


¹¹ Indeed, the path p already ends in u. If it contained e anywhere other than at the very
end, then it would thus contain the vertex u twice (since u is an endpoint of e).


We can also define “cut-vertices”: A vertex v of a graph G is said to be a cut-
vertex if the graph G \ v (that is, the graph G with the vertex v removed¹²) has
more components than G. Unfortunately, there doesn’t seem to be an analogue
of Corollary 2.12.5 for cut-vertices. Note also that removing a vertex (unlike
removing an edge) can add more than one component to the graph (or it can
also subtract 1 component if this vertex had degree 0). For example, removing
the vertex 0 from the graph


results in an empty graph on the set {1,2,3,4}, so the number of components 
has increased from 1 to 4. 


2.13. Dominating sets


2.13.1. Definition and basic facts


Here is another concept we can define for a graph: 


Definition 2.13.1. Let G = (V, E) be a simple graph.

A subset U of V is said to be dominating (for G) if it has the following
property: Each vertex v ∈ V \ U has at least one neighbor in U.

A dominating set for G (or dominating set of G) will mean a subset of V 
that is dominating. 


Example 2.13.2. Consider the cycle graph 


C_5 = ({1,2,3,4,5}, {12, 23, 34, 45, 51}).


¹² When we remove a vertex, we must of course also remove all edges that contain this vertex.


The set {1,3} is a dominating set for C_5, since all three vertices 2,4,5 that
don’t belong to {1,3} have neighbors in {1,3}. The set {1,5} is not a dom-
inating set for C_5, since the vertex 3 has no neighbor in {1,5}. There is no
dominating set for C_5 that has size 0 or 1, but there are several of size 2, and
every subset of size ≥ 3 is dominating.
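
These claims about C_5 are easy to confirm by listing all subsets. The following Python sketch is an illustration only (the function names are made up; the graph is given as an adjacency dict); it simply tests the defining property for each of the 2^5 subsets of the vertex set.

```python
from itertools import combinations

def is_dominating(adj, U):
    """Each vertex outside U has at least one neighbor in U."""
    U = set(U)
    return all(adj[v] & U for v in adj if v not in U)

def dominating_sets(adj):
    """All dominating sets, ordered by size."""
    V = sorted(adj)
    return [set(U) for r in range(len(V) + 1)
            for U in combinations(V, r) if is_dominating(adj, U)]

# The cycle graph C_5: vertex i is adjacent to i + 1 and i - 1 (cyclically).
C5 = {i: {i % 5 + 1, (i - 2) % 5 + 1} for i in range(1, 6)}

print(is_dominating(C5, {1, 3}), is_dominating(C5, {1, 5}))
for size in range(6):
    print(size, sum(1 for U in dominating_sets(C5) if len(U) == size))
```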


Here are some more examples: 


• If G = (V, E) is a simple graph, then the whole vertex set V is always
dominating, whereas the empty set ∅ is dominating only when V = ∅.


• If G = (V, E) is a complete graph, then any nonempty subset of V is
dominating.


• If G = (V, E) is an empty graph, then only V is dominating.


Clearly, the “denser” a graph is (i.e., the more edges it has), the “easier” it
is for a set to be dominating. Often, a graph is given, and one is interested in
finding a dominating set of the smallest possible size¹³. As the case of an empty
graph reveals, sometimes the only choice is the whole vertex set. However, in
many cases, we can do better. Namely, we need to require that the graph has
no isolated vertices:


Definition 2.13.3. Let G be a simple graph. A vertex v of G is said to be 
isolated if it has no neighbors (i.e., if deg v = 0).


An isolated vertex has to belong to every dominating set (since otherwise, 
it would need a neighbor in that set, but it has no neighbors). Thus, isolated 
vertices do not contribute much to the study of dominating sets, other than 
inflating their size. Therefore, when we look for dominating sets, we can restrict 
ourselves to graphs with no isolated vertices. There, we have the following 
result: 


Proposition 2.13.4. Let G = (V,E) be a simple graph that has no isolated 
vertices. Then: 


(a) There exists a dominating subset of V that has size ≤ |V|/2.


(b) There exist two disjoint dominating subsets A and B of V such that
A ∪ B = V.


¹³ Supposedly, this has applications in mobile networking: For example, you might want to
choose a set of routers in a given network so that each node is either a router or directly
connected (i.e., adjacent) to one.


One proof of this proposition will be given in Exercise 2.19 below (homework
set #2 exercise 4). Another appears in §3.6].

For specific graphs, the bound |V|/2 in Proposition 2.13.4 (a) can often be
improved. Here is an example:


Exercise 2.17. Let n > 3 be an integer. Find a formula for the smallest size
of a dominating set of the cycle graph C_n. You can use the ceiling function
x ↦ ⌈x⌉, which sends a real number x to the smallest integer that is ≥ x.


Exercise 2.18. Let n and k be positive integers such that n > k(k+1) and
k > 1. Recall (from Subsection 2.6.3) the Kneser graph KG_{n,k}, whose vertices
are the k-element subsets of {1,2,...,n}, and whose edges are the unordered
pairs {A, B} of such subsets with A ∩ B = ∅.

Prove that the minimum size of a dominating set of KG_{n,k} is k + 1.


Exercise 2.19. Let G = (V,E) be a connected simple graph with at least two 
vertices. 

The distance d (v, w) between two vertices v and w of G is defined to be
the smallest length of a path from v to w. (In particular, d (v, v) = 0 for each
v ∈ V.)

Fix a vertex v ∈ V. Define two subsets

A = {w ∈ V | d (v, w) is even} and B = {w ∈ V | d (v, w) is odd}

of V.


(a) Prove that A is dominating.

(b) Prove that B is dominating.

(c) Prove that there exists a dominating set of G that has size ≤ |V|/2.


(d) Prove that the claim of part (c) holds even if we don’t assume that G is
connected, as long as we assume that each vertex of G has at least one
neighbor. (In other words, prove Proposition 2.13.4 (a).)


2.13.2. The number of dominating sets 


Next, we state a rather surprising recent result about the number of dominating 
sets of a graph: 


Theorem 2.13.5 (Brouwer’s dominating set theorem). Let G be a simple 
graph. Then, the number of dominating sets of G is odd. 
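
Before turning to the proof, one can convince oneself of the statement by brute force on small graphs. The Python sketch below is a sanity check and not a proof; the function names are made up, and the vertex set is taken to be {0, 1, ..., n − 1}. It counts the dominating sets of every simple graph on at most 4 vertices.

```python
from itertools import combinations

def count_dominating_sets(n, edges):
    """Number of dominating sets of the graph ({0,...,n-1}, edges)."""
    adj = {v: set() for v in range(n)}
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    count = 0
    for r in range(n + 1):
        for U in combinations(range(n), r):
            U = set(U)
            # U is dominating iff each vertex outside U has a neighbor in U
            if all(adj[v] & U for v in range(n) if v not in U):
                count += 1
    return count

def all_simple_graphs(n):
    """Yield the edge set of every simple graph on {0,...,n-1}."""
    pairs = list(combinations(range(n), 2))
    for r in range(len(pairs) + 1):
        for edges in combinations(pairs, r):
            yield edges

for n in range(5):
    parities = {count_dominating_sets(n, E) % 2 for E in all_simple_graphs(n)}
    print(n, parities)
```

If the theorem holds, each printed set of parities consists of the single value 1.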
Three proofs of this theorem are given in Brouwer’s note [Brouwe09]¹⁴. Let
me show the one I like the most. We first need a notation:


Definition 2.13.6. Let G = (V, E) be a simple graph. A detached pair will
mean a pair (A, B) of two disjoint subsets A and B of V such that there exists
no edge ab ∈ E with a ∈ A and b ∈ B.


Example 2.13.7. Consider the cycle graph 


C_6 = ({1,2,3,4,5,6}, {12, 23, 34, 45, 56, 61}).


Then, ({1,2}, {4,5}) is a detached pair, whereas ({1,2}, {3,4}) is not (since
23 is an edge). Of course, there are many other detached pairs; in particular,
any pair of the form (∅, B) or (A, ∅) is detached.


Let me stress that the word “pair” always means “ordered pair” unless I say
otherwise. So, if (A, B) is a detached pair, then (B, A) is a different detached
pair, unless A = B = ∅.

Here is an attempt at a proof of Theorem 2.13.5. It is a nice example of how
to apply known results to new graphs to obtain new results. The only problem
is, it shows a result that is a bit at odds with the claim of the theorem...


Proof of Theorem 2.13.5, attempt 1. Write the graph G as (V, E).

Recall that P (V) denotes the set of all subsets of V. 

Construct a new graph H with the vertex set P (V) as follows: Two subsets
A and B of V are adjacent as vertices of H if and only if (A, B) is a detached
pair. (Note that if the original graph G has n vertices, then this graph H has 2^n
vertices. It is huge!)

I claim that the vertices of H that have odd degree are precisely the subsets 
of V that are dominating. In other words: 


Claim 1: Let A be a subset of V. Then, the vertex A of H has odd 
degree if and only if A is a dominating set of G. 


¹⁴ Other proofs can be found in the AoPS thread
https://artofproblemsolving.com/community/c6h358772p1960068 . (This thread is
concerned with a superficially different contest problem, but the latter
problem is quickly revealed to be Theorem 2.13.5 in a number-theoretical disguise.)
[Proof of Claim 1: We let N (A) denote the set of all vertices of G that have a
neighbor in A. (This may or may not be disjoint from A.)

The neighbors of A (as a vertex in H) are precisely the subsets B of V such
that (A, B) is a detached pair (by the definition of H). In other words, they are
the subsets B of V that are disjoint from A and also have no neighbors in A (by
the definition of a “detached pair”). In other words, they are the subsets B of
V that are disjoint from A and also disjoint from N (A). In other words, they
are the subsets of the set V \ (A ∪ N (A)). Hence, the number of such subsets
B is 2^{|V \ (A ∪ N (A))|}.

The degree of A (as a vertex of H) is the number of neighbors of A in H.
Thus, this degree is 2^{|V \ (A ∪ N (A))|} (because we have just shown that the
number of neighbors of A is 2^{|V \ (A ∪ N (A))|}). But 2^k is odd if and only if k = 0.
Thus, we conclude that the degree of A (as a vertex of H) is odd if and only if
|V \ (A ∪ N (A))| = 0. The condition |V \ (A ∪ N (A))| = 0 can be rewritten as
follows:


(|V \ (A ∪ N (A))| = 0)
⟺ (V \ (A ∪ N (A)) = Ø)
⟺ (V ⊆ A ∪ N (A))
⟺ (V \ A ⊆ N (A))
⟺ (each vertex v ∈ V \ A belongs to N (A))
⟺ (each vertex v ∈ V \ A has a neighbor in A)
⟺ (A is dominating) (by the definition of “dominating”) .


Thus, what we have just shown is that the degree of A (as a vertex of H) is odd 
if and only if A is dominating. This proves Claim 1.] 


Claim 1 shows that the vertices of H that have odd degree are precisely the 
dominating sets of G. But the handshake lemma tells us that
any simple graph has an even number of vertices of odd degree. Applying this 
to H, we conclude that there is an even number of dominating sets of G. 

Huh? We want to show that there is an odd number of dominating sets of G, 
not an even number! Why did we just get the opposite result? 

Puzzle: Find the mistake in our above reasoning! The answer will be revealed
on the next page. □

So what was the mistake in our reasoning?

The mistake is that our definition of H requires the vertex Ø of H to be
adjacent to itself (since (Ø, Ø) is a detached pair); but a vertex of a simple
graph cannot be adjacent to itself. So we need to tweak the definition of H
somewhat:


Correction of the above proof of Theorem 2.13.5. Define the graph H as above, but
do not try to have Ø adjacent to itself. (This is the only vertex that creates any
trouble, because a detached pair (A, B) cannot satisfy A = B unless both A and
B are Ø.)

We WLOG assume that V ≠ Ø (otherwise, the claim is obvious). Thus, the
empty set Ø is not dominating.

Our Claim 1 needs to be modified as follows: 


Claim 1’: Let A be a subset of V. Then, the vertex A of H has odd 
degree if and only if A is empty or a dominating set of G. 


This can be proved in the same way as we “proved” Claim 1 above; we just
need to treat the A = Ø case separately now (but this case is easy: Ø is adjacent
to all other vertices of H, and thus has degree 2^{|V|} − 1, which is odd).

So we conclude (using the handshake lemma) that the number of empty or
dominating sets is even. Subtracting 1 for the empty set, we conclude that the
number of dominating sets is odd (since the empty set is not dominating). This
proves Brouwer’s theorem (Theorem 2.13.5). □
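
Theorem 2.13.5 and Claim 1′ are easy to test by brute force on small graphs. The following Python sketch does this; it is only a sanity check, not part of the proof. The encoding of a graph as a vertex list plus a list of 2-element edges, and all function names, are assumptions made for this illustration.

```python
from itertools import combinations

def subsets(vertices):
    """Yield all subsets of `vertices` as frozensets."""
    vs = list(vertices)
    for r in range(len(vs) + 1):
        for c in combinations(vs, r):
            yield frozenset(c)

def neighborhood(A, edges):
    """N(A): the set of all vertices that have a neighbor in A."""
    N = set()
    for u, v in edges:
        if u in A:
            N.add(v)
        if v in A:
            N.add(u)
    return N

def is_dominating(A, vertices, edges):
    """A is dominating iff every vertex outside A has a neighbor in A."""
    return set(vertices) <= set(A) | neighborhood(A, edges)

def count_dominating_sets(vertices, edges):
    return sum(1 for A in subsets(vertices) if is_dominating(A, vertices, edges))

def degree_in_H(A, vertices, edges):
    """Degree of A in the corrected graph H (no loop at the empty set)."""
    free = set(vertices) - set(A) - neighborhood(A, edges)
    deg = 2 ** len(free)      # number of B such that (A, B) is detached
    if not A:
        deg -= 1              # B = A = empty set does not count as a neighbor
    return deg

# The 6-cycle C_6 from Example 2.13.7.
V = [1, 2, 3, 4, 5, 6]
E = [(1, 2), (2, 3), (3, 4), (4, 5), (5, 6), (6, 1)]
print(count_dominating_sets(V, E) % 2)   # parity of the number of dominating sets
```

The enumeration takes exponential time in |V|, so this is only meant for graphs with a handful of vertices.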


There are other ways to prove Brouwer’s theorem as well. A particularly 
nice one was found by Irene Heinrich and Peter Tittmann in 2017; they gave 
an “explicit” formula for the number of dominating sets that shows that this 
number is odd ([HeiTit17, Theorem 8], restated using the language of detached 
pairs): 


Theorem 2.13.8 (Heinrich-Tittmann formula). Let G = (V,E) be a simple 
graph with n vertices. Assume that n > 0. 

Let α be the number of all detached pairs (A, B) such that both numbers
|A| and |B| are even and positive.

Let β be the number of all detached pairs (A, B) such that both numbers
|A| and |B| are odd.

Then:


(a) The numbers α and β are even.


(b) The number of dominating sets of G is 2^n − 1 + α − β.


Part (a) of this theorem is obvious (recall that if (A, B) is a detached pair, then
so is (B, A)). Part (b) is the interesting part. In [17s, §3.3–§3.4], I give a long but
elementary proof.
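
The Heinrich–Tittmann formula can be checked numerically on small graphs by listing all detached pairs. The sketch below does this directly from the definitions; the function names and the graph encoding (a vertex list and a list of 2-element edges) are assumptions of this illustration, and the double loop over all pairs of subsets is only feasible for very small graphs.

```python
from itertools import combinations

def subsets(vertices):
    """Yield all subsets of `vertices` as frozensets."""
    vs = list(vertices)
    for r in range(len(vs) + 1):
        for c in combinations(vs, r):
            yield frozenset(c)

def is_detached(A, B, edges):
    """(A, B) is detached iff A and B are disjoint and no edge joins A to B."""
    if A & B:
        return False
    return not any((u in A and v in B) or (u in B and v in A) for u, v in edges)

def heinrich_tittmann(vertices, edges):
    """Return (alpha, beta, 2^n - 1 + alpha - beta) as in Theorem 2.13.8."""
    all_subsets = list(subsets(vertices))
    alpha = beta = 0
    for A in all_subsets:
        for B in all_subsets:
            if not is_detached(A, B, edges):
                continue
            if A and B and len(A) % 2 == 0 and len(B) % 2 == 0:
                alpha += 1        # both sizes even and positive
            elif len(A) % 2 == 1 and len(B) % 2 == 1:
                beta += 1         # both sizes odd
    n = len(list(vertices))
    return alpha, beta, 2 ** n - 1 + alpha - beta

def count_dominating_sets(vertices, edges):
    """Direct count, for comparison with the formula."""
    count = 0
    for A in subsets(vertices):
        covered = set(A)
        for u, v in edges:
            if u in A:
                covered.add(v)
            if v in A:
                covered.add(u)
        if covered == set(vertices):
            count += 1
    return count

# The path graph 1 - 2 - 3: here alpha = 0 and beta = 2.
print(heinrich_tittmann([1, 2, 3], [(1, 2), (2, 3)]))   # (0, 2, 5)
```

For the path 1 - 2 - 3, the only detached pairs with both sizes odd are ({1}, {3}) and ({3}, {1}), and the five dominating sets are {2}, {1,2}, {2,3}, {1,3} and {1,2,3}.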
More recently ([HeiTit18]), Heinrich and Tittmann have refined their formula
to allow counting dominating sets of a given size. Their main result is the
following formula (exercise 5 on homework set #2):


Exercise 2.20. Let G = (V, E) be a simple graph with at least one vertex. Let
n = |V|. A detached pair means a pair (A, B) of two disjoint subsets A and
B of V such that there exists no edge ab ∈ E with a ∈ A and b ∈ B.

Prove the following generalization of the Heinrich–Tittmann formula:


∑_{S is a dominating set of G} x^{|S|}
    = (1 + x)^n − 1 + ∑_{(A,B) is a detached pair; A ≠ Ø; B ≠ Ø} (−1)^{|A|} x^{|B|}.


(Here, both sides are polynomials in a single indeterminate x with coefficients
in Z.)


[Hint: This is a generalization of the Heinrich–Tittmann formula for the
number of dominating sets. (The latter formula can be obtained fairly easily
by substituting x = 1 into the above and subsequently cancelling the addends
with |A| ≢ |B| mod 2 against each other.) You are free to copy arguments
from the proof of that formula and change whatever needs to be changed. (Some
lemmas can even be used without any changes; they can then be cited without proof.)]


The following exercise gives a generalization of Theorem 2.13.5 (to recover
Theorem 2.13.5 from it, set k = 1):


Exercise 2.21. Let k be a positive integer. Let G = (V, E) be a simple graph.
A subset U of V will be called k-path-dominating if for every v ∈ V, there
exists a path of length ≤ k from v to some element of U.

Prove that the number of all k-path-dominating subsets of V is odd. 


[Hint: This is not as substantial a generalization as it may look. The short- 
est proof is very short.] 


[Solution: This is Exercise 6 on homework set #1 from my Spring 2017 


course; see the course page for solutions.] 


2.14. Hamiltonian paths and cycles 
2.14.1. Basics 


Now to something different. Here is a quick question: Given a simple graph G, 
when is there a closed walk that contains each vertex of G ? 

The answer is easy: When G is connected. Indeed, if a simple graph G is
connected, then we can label its vertices by v_1, v_2, ..., v_n arbitrarily, and we
then get a closed walk by composing a walk from v_1 to v_2 with a walk from
v_2 to v_3 with a walk from v_3 to v_4 and so on, ending with a walk from v_n to
v_1. This closed walk will certainly contain each vertex. Conversely, such a walk
cannot exist if G is not connected.

The question becomes a lot more interesting if we replace “closed walk” by 
“path” or “cycle”. The resulting objects have a name: 


Definition 2.14.1. Let G = (V, E) be a simple graph.


(a) A Hamiltonian path in G means a walk of G that contains each vertex
of G exactly once. Obviously, it is a path.


(b) A Hamiltonian cycle in G means a cycle (v_0, v_1, ..., v_k) of G such that
each vertex of G appears exactly once among v_0, v_1, ..., v_{k−1}.


Some graphs have Hamiltonian paths; some don’t. Having a Hamiltonian
cycle is even stronger than having a Hamiltonian path, because if (v_0, v_1, ..., v_k)
is a Hamiltonian cycle of G, then (v_0, v_1, ..., v_{k−1}) is a Hamiltonian path of G.


Convention 2.14.2. In the following, we will abbreviate: 


• “Hamiltonian path” as “hamp”;


• “Hamiltonian cycle” as “hamc”.


Example 2.14.3. Which of the following eight graphs have hamps? Which
have hamcs?

[figure: eight graphs, labeled A, B, C, D, E, F, G, H]


Answers: 
• The graph A has a hamc (1,2,3,4,5,6,1), and thus a hamp
(1,2,3,4,5,6). (Recall that a graph that has a hamc always has a hamp,
since we can simply remove the last vertex from a hamc to obtain a
hamp.)


• The graph B has a hamp (2,3,1,4,5,6), but no hamc. The easiest way
to see that B has no hamc is the following: The edge 14 is a cut-edge
(i.e., removing it renders the graph disconnected), thus a bridge (i.e., an
edge that appears in no cycle); therefore, any cycle must stay entirely
“on one side” of this edge.


• The graph C has a hamp (0,1,2,3), but no hamc. The argument for the
non-existence of a hamc is the same as for B: The edge 01 is a bridge.


• The graph D has neither a hamp nor a hamc, because it is not connected.
Only a connected graph can have a hamp.


• The graph E has a hamp (0,3,2,1,6,5,4), but no hamc (checking this
requires some work, though).


• The graph F has a hamc (1,2,3,4,8,7,6,5,1), thus also a hamp.


• The graph G has a hamc (1,2,3,4,5,5′,4′,3′,2′,1′,1), thus also a hamp.


• The graph H (which, by the way, is isomorphic to the Petersen graph
from Subsection 2.6.3) has a hamp (1,3,5,2,4,4′,3′,2′,1′,5′), but no
hamc (but this is not obvious! see the Wikipedia article for an argument).


In general, finding a hamp or a hamc, or proving that none exists, is a hard 
problem. It can always be solved by brute force (i.e., by trying all lists of distinct 
vertices and checking if there is a hamp among them, and likewise for hamcs), 
but this quickly becomes forbiddingly laborious as the size of the graph
increases. Some faster algorithms exist (in particular, there is one of running time
O (n^2 2^n), where n is the number of vertices), but no polynomial-time algorithm
is known. The problem (both in its hamp version and in its hamc version) 
is known to be NP-hard (in the language of complexity theory). In practice, 
hamps and hamcs can often be found with some wit and perseverance; proofs 
of their non-existence can often be obtained with some logic and case analysis 
(see the above example for some sample arguments). See the Wikipedia page 
for “Hamiltonian path problem” for more information. 
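
To make the remark about faster-than-brute-force algorithms concrete, here is a sketch of one standard bitmask dynamic-programming approach that runs in O (n^2 2^n) time. It is not necessarily the specific algorithm alluded to above. The sketch assumes that the vertices are 0, 1, ..., n−1 and that the edges are given as pairs; the function names are made up for this illustration.

```python
def _path_ends(n, edges, starts):
    """Return (adj, reach), where adj[v] is the neighbor bitmask of v, and
    reach[mask] is the bitmask of all vertices w such that some path that
    starts at a vertex in `starts` visits exactly the vertex set `mask`
    and ends at w."""
    adj = [0] * n
    for u, v in edges:
        adj[u] |= 1 << v
        adj[v] |= 1 << u
    reach = [0] * (1 << n)
    for s in starts:
        reach[1 << s] |= 1 << s
    for mask in range(1, 1 << n):
        for v in range(n):
            if (reach[mask] >> v) & 1:
                free = adj[v] & ~mask            # unvisited neighbors of v
                for w in range(n):
                    if (free >> w) & 1:
                        reach[mask | (1 << w)] |= 1 << w
    return adj, reach

def has_hamp(n, edges):
    """Does the graph have a Hamiltonian path?"""
    _, reach = _path_ends(n, edges, range(n))
    return n >= 1 and reach[(1 << n) - 1] != 0

def has_hamc(n, edges):
    """Does the graph have a Hamiltonian cycle? (A cycle needs n >= 3.)"""
    if n < 3:
        return False
    # A hamc through vertex 0 is a hamp that starts at 0 and ends at a neighbor of 0.
    adj, reach = _path_ends(n, edges, [0])
    return (reach[(1 << n) - 1] & adj[0]) != 0

# The Petersen graph: outer 5-cycle, five spokes, inner pentagram.
petersen = ([(i, (i + 1) % 5) for i in range(5)]
            + [(i, i + 5) for i in range(5)]
            + [(5 + i, 5 + (i + 2) % 5) for i in range(5)])
print(has_hamp(10, petersen), has_hamc(10, petersen))
```

Masks are processed in increasing order, and extending a path always produces a larger mask, so each entry of the table is final by the time it is read.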

The problem of finding hamps is related to the so-called “traveling salesman 
problem” (TSP), which asks for a hamp with “minimum weight” in a weighted 
graph (each edge has a number assigned to it, which is called its “weight”, and 
the weight of a hamp is the sum of the weights of the edges it uses). There is a 
lot of computer-science literature about this problem. 
2.14.2. Sufficient criteria: Ore and Dirac


We shall now show some necessary criteria and some sufficient criteria (but no 
necessary-and-sufficient criteria) for the existence of hamps and hamcs. Here 
is the most famous sufficient criterion: 


Theorem 2.14.4 (Ore’s theorem). Let G = (V, E) be a simple graph with n
vertices, where n ≥ 3.
Assume that deg x + deg y ≥ n for any two non-adjacent vertices x and y.
Then, G has a hamc.


There are various proofs of this theorem scattered around; see [Harju14,
Theorem 3.6] or [Guicha16, Theorem 5.3.2]. We shall give another proof (following
the “Algorithm” section on the Wikipedia page for “Ore’s theorem”):


Proof of Theorem 2.14.4. A listing (of V) shall mean a list of elements of V that
contains each element exactly once. It must clearly be an n-tuple.

The hamness of a listing (v_1, v_2, ..., v_n) will mean the number of all i ∈
{1,2,...,n} such that v_i v_{i+1} ∈ E. Here, we set v_{n+1} = v_1. (Visually, it is best
to represent a listing (v_1, v_2, ..., v_n) by drawing the vertices v_1, v_2, ..., v_n on a
circle in this order. Its hamness then counts how often two successive vertices
on the circle are adjacent in the graph G.) Note that the hamness of a listing
(v_1, v_2, ..., v_n) does not change if we cyclically rotate the listing (i.e., transform
it into (v_2, v_3, ..., v_n, v_1)).

Clearly, if we can find a listing (v_1, v_2, ..., v_n) of hamness ≥ n, then all of
v_1 v_2, v_2 v_3, ..., v_n v_1 are edges of G, and thus (v_1, v_2, ..., v_n, v_1) is a hamc of G.
Thus, we need to find a listing of hamness ≥ n.

To do so, I will show that if you have a listing of hamness < n, then you can 
slightly modify it to get a listing of larger hamness. In other words, I will show 
the following: 


Claim 1: Let (v_1, v_2, ..., v_n) be a listing of hamness k < n. Then,
there exists a listing of hamness larger than k. 


[Proof of Claim 1: Since the listing (v_1, v_2, ..., v_n) has hamness k < n, there
exists some i ∈ {1,2,...,n} such that v_i v_{i+1} ∉ E. Pick such an i. Thus, the
vertices v_i and v_{i+1} of G are non-adjacent. The “deg x + deg y ≥ n” assumption
of the theorem thus yields deg (v_i) + deg (v_{i+1}) ≥ n.

However,


deg (v_i) = |{w ∈ V | v_i w ∈ E}|
= |{j ∈ {1,2,...,n} | v_i v_j ∈ E}|
= |{j ∈ {1,2,...,n} \ {i} | v_i v_j ∈ E}|

(because j = i could not satisfy v_i v_j ∈ E anyway) and


deg (v_{i+1}) = |{w ∈ V | v_{i+1} w ∈ E}|
= |{j ∈ {1,2,...,n} | v_{i+1} v_{j+1} ∈ E}|
    (since (v_2, v_3, ..., v_{n+1}) is a listing of V (because v_{n+1} = v_1))
= |{j ∈ {1,2,...,n} \ {i} | v_{i+1} v_{j+1} ∈ E}|


(because j = i could not satisfy v_{i+1} v_{j+1} ∈ E anyway). In light of these two
equalities, we can rewrite the inequality deg (v_i) + deg (v_{i+1}) ≥ n as


|{j ∈ {1,2,...,n} \ {i} | v_i v_j ∈ E}|
+ |{j ∈ {1,2,...,n} \ {i} | v_{i+1} v_{j+1} ∈ E}| ≥ n.


Thus, the two subsets {j ∈ {1,2,...,n} \ {i} | v_i v_j ∈ E} and
{j ∈ {1,2,...,n} \ {i} | v_{i+1} v_{j+1} ∈ E} of the (n − 1)-element set {1,2,...,n} \
{i} have total size ≥ n (that is, the sum of their sizes is ≥ n). Hence, these two
subsets must overlap (i.e., have an element in common). In other words, there
exists a j ∈ {1,2,...,n} \ {i} that satisfies both v_i v_j ∈ E and v_{i+1} v_{j+1} ∈ E. Pick
such a j.

Now, consider a new listing obtained from the old listing (v_1, v_2, ..., v_n) as
follows:


• First, cyclically rotate the old listing so that it begins with v_{i+1}. Thus, you
get the listing (v_{i+1}, v_{i+2}, ..., v_n, v_1, v_2, ..., v_i).


• Then, reverse the part of the listing starting at v_{i+1} and ending at v_j. Thus,
you get the new listing

(v_j, v_{j−1}, ..., v_{i+1}, v_{j+1}, v_{j+2}, ..., v_i).

Here, the first part v_j, v_{j−1}, ..., v_{i+1} is the reversed part; it may or may not
“wrap around” (i.e., contain ..., v_1, v_n, ... somewhere). The second part
v_{j+1}, v_{j+2}, ..., v_i is the part that was not reversed.


This is the new listing we want.


I claim that this new listing has hamness larger than k. Indeed, rotating the
old listing clearly did not change its hamness. But reversing the part from v_{i+1}
to v_j clearly did: After the reversal, the edges v_i v_{i+1} and v_j v_{j+1} no longer count
towards the hamness (if they were edges to begin with), but the edges v_i v_j and
v_{i+1} v_{j+1} started counting towards the hamness. This is a good bargain, because
it means that the hamness gained +2 from the newly-counted edges v_i v_j and
v_{i+1} v_{j+1} (which, as we know, both exist), while only losing 0 or 1 (since the
edge v_i v_{i+1} did not exist, whereas the edge v_j v_{j+1} may or may not have been
lost). Thus, the hamness of the new listing is larger than the hamness of the
old listing either by 1 or 2. In other words, it is larger than k by 1 or 2.
This proves Claim 1.]


Now, we can start with any listing of V and keep modifying it using Claim 1,
increasing its hamness each time, until its hamness becomes ≥ n. But once its
hamness is ≥ n, we have found a hamc (as explained above). Theorem 2.14.4 is
thus proven. □
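
The proof above is constructive, so it can be turned into a short program. The following Python sketch follows the rotate-and-reverse step from the proof of Claim 1. The function name `ore_hamc` and the input format (an iterable of vertices and a list of edges as pairs) are assumptions of this illustration. If Ore's condition fails, the search for the index j may come up empty, in which case the sketch raises an error.

```python
def ore_hamc(vertices, edges):
    """Find a hamc in a graph with n >= 3 vertices that satisfies
    deg x + deg y >= n for all non-adjacent x and y.
    Returns the hamc as a list (v_1, ..., v_n, v_1)."""
    adj = {v: set() for v in vertices}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    listing = list(vertices)
    n = len(listing)
    if n < 3:
        raise ValueError("a hamc needs at least 3 vertices")
    while True:
        # Look for a gap: an index i with v_i v_{i+1} not an edge (cyclically).
        gap = next((i for i in range(n)
                    if listing[(i + 1) % n] not in adj[listing[i]]), None)
        if gap is None:
            return listing + [listing[0]]      # hamness n: we have a hamc
        # Rotate so that the listing begins with v_{i+1} and ends with v_i.
        listing = listing[gap + 1:] + listing[:gap + 1]
        first, last = listing[0], listing[-1]  # v_{i+1} and v_i
        # Find t with v_i adjacent to listing[t] (= v_j) and
        # v_{i+1} adjacent to listing[t + 1] (= v_{j+1}).
        for t in range(n - 1):
            if listing[t] in adj[last] and listing[t + 1] in adj[first]:
                break
        else:
            raise ValueError("no suitable j found; Ore's condition must fail")
        # Reverse the part from v_{i+1} to v_j.
        listing[:t + 1] = listing[:t + 1][::-1]

# The complete bipartite graph K_{3,3}: any two non-adjacent vertices
# have degrees 3 + 3 = 6 = n.
k33 = [(a, b) for a in (0, 1, 2) for b in (3, 4, 5)]
print(ore_hamc(range(6), k33))   # [4, 1, 5, 0, 3, 2, 4]
```

Each pass through the loop raises the hamness by at least 1, so the loop runs at most n times before it returns or raises.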


Corollary 2.14.5 (Dirac’s theorem). Let G = (V, E) be a simple graph with n
vertices, where n ≥ 3.


Assume that deg x ≥ n/2 for each vertex x ∈ V.
Then, G has a hamc.


Proof. Follows from Ore’s theorem, since any two vertices x and y of G satisfy

deg x + deg y ≥ n/2 + n/2 = n    (since deg x ≥ n/2 and deg y ≥ n/2). □


Exercise 2.22. 


(a) Let G = (V, E) be a simple graph, and let u and v be two distinct
vertices of G that are not adjacent. Let n = |V|. Assume that deg u +
deg v ≥ n. Let G′ = (V, E ∪ {uv}) be the simple graph obtained from
G by adding a new edge uv. Assume that G′ has a hamc. Prove that G
has a hamc.


(b) Does this remain true if we replace “hamc” by “hamp”? 


2.14.3. A necessary criterion 


So much for sufficient criteria. What about necessary criteria? 


Proposition 2.14.6. Let G = (V, E) be a simple graph. 

For each subset S of V, we let G \ S be the induced subgraph of G on the
set V \ S. (In other words, this is the graph obtained from G by removing all
vertices in S and removing all edges that have at least one endpoint in S.)
(For example, [figure: a graph G, the set S = {3,6}, and the resulting graph G \ S].)

Also, we let b_0 (H) denote the number of connected components of a simple
graph H.

(a) If G has a hamc, then every nonempty S ⊆ V satisfies b_0 (G \ S) ≤ |S|.

(b) If G has a hamp, then every S ⊆ V satisfies b_0 (G \ S) ≤ |S| + 1.

For example, part (a) of this proposition shows that the graph E from Example
2.14.3 has no hamc, because if we take S to be {3,6}, then b_0 (G \ S) = 3
whereas |S| = 2. Thus, the proposition can be used to rule out the existence of
hamps and hamcs in some cases.


Proof of Proposition 2.14.6. (a) Let S ⊆ V be a nonempty set. If we cut |S| many
vertices out of a cycle, then the cycle splits into at most |S| paths:

[figure: a cycle with some vertices marked by daggers; removing the marked
vertices leaves a disjoint union of paths]

Of course, our graph G itself may not be a cycle, but if it has a hamc, then the
removal of the vertices in S will split the hamc into at most |S| paths (according
to the preceding sentence), and thus the graph G \ S will have ≤ |S| many
components (just using the surviving edges of the hamc alone). Taking into
account all the other edges of G can only decrease the number of components.

(b) This is analogous to part (a). □


This proposition often (but not always) gives a quick way of convincing yourself
that a graph has no hamc or hamp. Alas, its converse is false. Case in point:
The Petersen graph (defined in Subsection 2.6.3) has no hamc, but it does satisfy
the “every nonempty S ⊆ V satisfies b_0 (G \ S) ≤ |S|” condition of Proposition
2.14.6 (a).
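
For small graphs, the criterion of Proposition 2.14.6 can be tested mechanically by trying all subsets S. The sketch below does so; the names `num_components` and `violating_set` and the input format (vertices as an iterable, edges as pairs) are assumptions of this illustration. A returned set S certifies that no hamc (or hamp) exists, whereas the answer `None` proves nothing, as the Petersen graph shows.

```python
from itertools import combinations

def num_components(vertices, edges):
    """b_0 of the graph (vertices, edges), computed with a small union-find."""
    parent = {v: v for v in vertices}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x

    for u, v in edges:
        parent[find(u)] = find(v)
    return len({find(v) for v in vertices})

def violating_set(vertices, edges, for_hamp=False):
    """Return a set S violating Proposition 2.14.6 (a) (or (b) if for_hamp),
    or None if no such set exists."""
    vs = list(vertices)
    extra = 1 if for_hamp else 0
    smallest = 0 if for_hamp else 1         # part (a) only concerns nonempty S
    for r in range(smallest, len(vs) + 1):
        for S in combinations(vs, r):
            removed = set(S)
            rest = [v for v in vs if v not in removed]
            kept = [(u, v) for u, v in edges
                    if u not in removed and v not in removed]
            if num_components(rest, kept) > r + extra:
                return removed
    return None

# A star with center 0 and three leaves: removing the center leaves 3 components.
star = [(0, 1), (0, 2), (0, 3)]
print(violating_set(range(4), star))                  # {0}
print(violating_set(range(4), star, for_hamp=True))   # {0}
```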


2.14.4. Hypercubes 


Now, let us move on to a concrete example of a graph that has a hamc.


Definition 2.14.7. Let n ∈ ℕ. The n-hypercube Q_n (more precisely, the n-th
hypercube graph) is the simple graph with vertex set


{0,1}^n = {(a_1, a_2, ..., a_n) | each a_i belongs to {0,1}}


and edge set defined as follows: A vertex (a_1, a_2, ..., a_n) ∈ {0,1}^n is adjacent
to a vertex (b_1, b_2, ..., b_n) ∈ {0,1}^n if and only if there exists exactly one
i ∈ {1,2,...,n} such that a_i ≠ b_i. (For example, in Q_4, the vertex (0,1,1,0) is
adjacent to (0,1,0,0).)

The elements of {0,1}^n are often called bitstrings (or binary words), and
their entries are called their bits (or letters). So two bitstrings are adjacent in
Q_n if and only if they differ in exactly one bit.

We often write a bitstring (a_1, a_2, ..., a_n) as a_1 a_2 ⋯ a_n. (For example, we
write (0,1,1,0) as 0110.)
Example 2.14.8. Here is what the n-hypercubes Q_n look like for n = 1, 2, 3:

[figure: drawings of Q_1, Q_2 and Q_3]

This should explain the name “hypercube”. The 0-hypercube Q_0 is a graph
with just one vertex (namely, the empty bitstring ()).


Theorem 2.14.9 (Gray). Let n ≥ 2. Then, the graph Q_n has a hamc.


Such hamcs are known as Gray codes. They are circular lists of bitstrings of
length n such that two consecutive bitstrings in the list always differ in exactly
one bit. See the Wikipedia article on “Gray codes” for applications.


Proof of Theorem 2.14.9. We will show something stronger:


Claim 1: For each n ≥ 1, the n-hypercube Q_n has a hamp from
00⋯0 to 100⋯0.


(Keep in mind that 00⋯0 and 100⋯0 are bitstrings, not numbers:

00⋯0 = (0, 0, ..., 0) (with n zeroes);   100⋯0 = (1, 0, 0, ..., 0) (with n−1 zeroes).)
[Proof of Claim 1: We induct on n.
Induction base: A look at Q_1 reveals a hamp from 0 to 1.
Induction step: Fix N ≥ 2. We assume that Claim 1 holds for n = N − 1. In
other words, Q_{N−1} has a hamp from 00⋯0 (with N−1 zeroes) to 1 00⋯0 (with
N−2 zeroes). Let p be such a hamp.

By attaching a 0 to the front of each bitstring (= vertex) in p, we obtain a path
q from 00⋯0 (with N zeroes) to 01 00⋯0 (with N−2 zeroes) in Q_N.

By attaching a 1 to the front of each bitstring (= vertex) in p, we obtain a path
r from 1 00⋯0 (with N−1 zeroes) to 11 00⋯0 (with N−2 zeroes) in Q_N.

Now, we assemble a hamp from 00⋯0 (with N zeroes) to 1 00⋯0 (with N−1 zeroes)
in Q_N as follows:


• Start at 00⋯0 (with N zeroes), and follow the path q to its end (i.e., to
01 00⋯0 (with N−2 zeroes)).


• Then, move to the adjacent vertex 11 00⋯0 (with N−2 zeroes).


• Then, follow the path r backwards, ending up at 1 00⋯0 (with N−1 zeroes).


This shows that Claim 1 holds for n = N, too.]

Claim 1 tells us that the n-hypercube Q_n has a hamp from 00⋯0 to 100⋯0.
Since its starting point 00⋯0 and its ending point 100⋯0 are adjacent, we
can turn this hamp into a hamc by appending the starting point 00⋯0 again
at the end. This proves Theorem 2.14.9. □
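
The induction in the proof of Claim 1 translates directly into a recursive procedure. Here is a Python sketch of it; the function name `gray_path` and the representation of bitstrings as tuples are assumptions of this illustration.

```python
def gray_path(n):
    """A hamp of Q_n from 00...0 to 100...0 (for n >= 1), built as in the
    proof of Claim 1. Bitstrings are represented as tuples of 0s and 1s."""
    if n == 1:
        return [(0,), (1,)]
    p = gray_path(n - 1)              # hamp of Q_{n-1} from 00...0 to 100...0
    q = [(0,) + v for v in p]         # path from 000...0 to 0100...0
    r = [(1,) + v for v in p]         # path from 100...0 to 1100...0
    return q + r[::-1]                # follow q, hop to 1100...0, walk r backwards

print(["".join(map(str, v)) for v in gray_path(3)])
# ['000', '001', '011', '010', '110', '111', '101', '100']
```

Appending the first bitstring once more at the end of the list turns this hamp into a hamc whenever n ≥ 2, as in the last step of the proof.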


2.14.5. Cartesian products 


Theorem 2.14.9 can in fact be generalized. To state the generalization, we define
the Cartesian product of two graphs:


Definition 2.14.10. Let G = (V, E) and H = (W, F) be two simple graphs.
The Cartesian product G × H of these two graphs is defined to be the simple
graph (V × W, E′ ∪ F′), where


E′ := {(v_1, w) (v_2, w) | v_1 v_2 ∈ E and w ∈ W} and
F′ := {(v, w_1) (v, w_2) | w_1 w_2 ∈ F and v ∈ V}.


In other words, it is the graph whose vertices are pairs (v, w) ∈ V × W
consisting of a vertex of G and a vertex of H, and whose edges are of the
forms

(v_1, w) (v_2, w) where v_1 v_2 ∈ E and w ∈ W
and
(v, w_1) (v, w_2) where w_1 w_2 ∈ F and v ∈ V.


For example, the Cartesian product G × P_2 of a simple graph G with the
2-path graph P_2 can be constructed by overlaying two copies of G and additionally
joining each vertex of the first copy to the corresponding vertex of the
second copy by an edge. (The vertices of the first copy are the (v, 1), whereas
the vertices of the second copy are the (v, 2).) For a specific example, here is
the 5-cycle graph C_5 and the Cartesian product C_5 × P_2:

[figure: the 5-cycle graph C_5 and the Cartesian product C_5 × P_2]


As another instance of the above description of G x P2, it is easy to see the 
following: 


Proposition 2.14.11. We have Q_n ≅ Q_{n−1} × P_2 for each n ≥ 1. (See Definition
2.14.7 for the definitions of Q_n and Q_{n−1}.)


Proof. This is Exercise 1 (a) on homework set #2 from my Spring 2017 course;
see the course page for solutions. □
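
Definition 2.14.10 and Proposition 2.14.11 can be explored computationally. The sketch below builds Cartesian products of simple graphs and checks, for small n, that one particular map is an isomorphism from Q_{n−1} × P_2 to Q_n. This map (which sends a pair (a, w) to the bitstring a with the bit w − 1 appended) is just one natural choice made for this illustration, not necessarily the one used in the homework solution; likewise, all function names and the encoding of a graph as a pair (vertex list, set of 2-element frozensets) are assumptions.

```python
from itertools import product

def hypercube(n):
    """The n-hypercube Q_n as (vertex list, set of edges)."""
    V = list(product((0, 1), repeat=n))
    E = {frozenset((a, a[:i] + (1 - a[i],) + a[i + 1:]))
         for a in V for i in range(n)}      # flip exactly one bit
    return V, E

def cartesian_product(G, H):
    """The Cartesian product G x H of two simple graphs."""
    (V, E), (W, F) = G, H
    vertices = [(v, w) for v in V for w in W]
    edges = set()
    for e in E:                              # edges (v1, w)(v2, w)
        v1, v2 = tuple(e)
        edges |= {frozenset(((v1, w), (v2, w))) for w in W}
    for f in F:                              # edges (v, w1)(v, w2)
        w1, w2 = tuple(f)
        edges |= {frozenset(((v, w1), (v, w2))) for v in V}
    return vertices, edges

P2 = ([1, 2], {frozenset((1, 2))})           # the 2-path graph P_2

def check_iso(n):
    """Check that (a, w) -> a + (w - 1,) is a graph isomorphism
    from Q_{n-1} x P_2 to Q_n."""
    V, E = cartesian_product(hypercube(n - 1), P2)
    f = lambda x: x[0] + (x[1] - 1,)
    Vn, En = hypercube(n)
    image_edges = {frozenset(f(x) for x in e) for e in E}
    return sorted(map(f, V)) == sorted(Vn) and image_edges == En

print(all(check_iso(n) for n in range(1, 6)))   # True
```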


Now, we claim the following: 


Theorem 2.14.12. Let G and H be two simple graphs. Assume that each of
the two graphs G and H has a hamp. Then:

(a) The Cartesian product G × H has a hamp.


(b) Now assume furthermore that at least one of the two numbers |V (G)|
and |V (H)| is even, and that both numbers |V (G)| and |V (H)| are
larger than 1. Then, the Cartesian product G × H has a hamc.


Proof. This is Exercise 1 on homework set #2 from my Spring 2017 course
(specifically, its parts (b) and (c)); see the course page for solutions. □


Now, Theorem 2.14.9 can be reproved (again by inducting on n) using Theorem
2.14.12 (b) and Proposition 2.14.11, since P_2 has a hamp and since |V (P_2)| =
2 is even. (Convince yourself that this works!)


2.14.6. Subset graphs 


The n-hypercube Q_n can be reinterpreted in terms of subsets of {1,2,...,n}.
Namely: Let n ∈ ℕ. Let G_n be the simple graph whose vertex set is the
powerset P ({1,2,...,n}) of {1,2,...,n} (that is, the vertices are all 2^n subsets
of {1,2,...,n}), and whose edges are determined as follows: Two vertices S
and T are adjacent if and only if one of the two sets S and T is obtained from
the other by inserting an extra element (i.e., we have either S = T ∪ {s} for
some s ∉ T, or T = S ∪ {t} for some t ∉ S). Then, G_n ≅ Q_n, since the map


{0,1}^n → P ({1,2,...,n}),
(a_1, a_2, ..., a_n) ↦ {i ∈ {1,2,...,n} | a_i = 1}

is a graph isomorphism from Q_n to G_n.

Thus, Theorem 2.14.9 shows that for each n ≥ 2, the graph G_n has a hamc. In
other words, for each n ≥ 2, we can list all subsets of {1,2,...,n} in a circular
list in such a way that each subset on this list is obtained from the previous one
by inserting or removing a single element. For example, for n = 3, here is such
a list:

Ø, {1}, {1,2}, {2}, {2,3}, {1,2,3}, {1,3}, {3}.
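
The correspondence between bitstrings and subsets, and the defining property of such circular lists, are easy to check by machine. In the following Python sketch, the function names are assumptions of this illustration; the first list is the n = 3 example with all eight subsets of {1,2,3}, and the second is a circular list of the subsets of {1,2,3} of sizes 1 and 2.

```python
def bits_to_subset(a):
    """The isomorphism Q_n -> G_n: send (a_1, ..., a_n) to {i | a_i = 1}."""
    return frozenset(i for i, bit in enumerate(a, start=1) if bit == 1)

def is_circular_subset_list(sets):
    """Check that each set differs from the cyclically previous one by
    inserting or removing exactly one element."""
    return all(len(sets[i] ^ sets[i - 1]) == 1 for i in range(len(sets)))

all_subsets = [frozenset(s) for s in
               ([], [1], [1, 2], [2], [2, 3], [1, 2, 3], [1, 3], [3])]
middle_levels = [frozenset(s) for s in
                 ([1], [1, 2], [2], [2, 3], [3], [1, 3])]

print(is_circular_subset_list(all_subsets))      # True
print(is_circular_subset_list(middle_levels))    # True
print(sorted(bits_to_subset((0, 1, 1, 0))))      # [2, 3]
```

The index i − 1 wraps around to the last entry when i = 0, which is what makes the check circular.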
A long-standing question only resolved a few years ago asked whether the
same can be done with the subsets of {1,2,...,n} having size (n ± 1)/2 when n is
odd. For example, for n = 3, we can do it as follows:

{1}, {1,2}, {2}, {2,3}, {3}, {1,3}.

In other words, if n ≥ 3 is odd, and if G′_n is the induced subgraph of G_n on the
set of all subsets J of {1,2,...,n} that satisfy |J| ∈ {(n−1)/2, (n+1)/2}, then does
G′_n have a hamc?
Since G_n ≅ Q_n, we can restate this question equivalently as follows: If n ≥ 3
is odd, and if Q′_n is the induced subgraph of Q_n on the set

{a_1 a_2 ⋯ a_n ∈ {0,1}^n | ∑_{i=1}^{n} a_i ∈ {(n−1)/2, (n+1)/2}},

then does Q′_n have a hamc?

In 2014, Torsten Mütze proved that the answer is “yes”. His proof is
truly nontrivial; a recent survey covers similar questions.
(Cf. also change ringing.)


The following exercise provides another generalization of Theorem 2.14.9:


Exercise 2.23. Let n and k be two integers such that n > k > 0. Define the
simple graph Q_{n,k} as follows: Its vertices are the bitstrings (a_1, a_2, ..., a_n) ∈
{0,1}^n; two such bitstrings are adjacent if and only if they differ in exactly k
bits (in other words: two vertices (a_1, a_2, ..., a_n) and (b_1, b_2, ..., b_n) are
adjacent if and only if the number of i ∈ {1,2,...,n} satisfying a_i ≠ b_i equals k).
(Thus, Q_{n,1} is the n-hypercube graph Q_n.)


(a) Does Q_{n,k} have a hamc when k is even? (Recall that “hamc” is short for
“Hamiltonian cycle”.)


(b) Does Q_{n,k} have a hamc when k is odd?


[Hint: One way to approach part (b) is by identifying the set {0,1} with
the field 𝔽_2 with two elements. The bitstrings (a_1, a_2, ..., a_n) ∈ {0,1}^n thus
become the size-n row vectors in the 𝔽_2-vector space 𝔽_2^n. Let e_1, e_2, ..., e_n be
the standard basis vectors of 𝔽_2^n (so that e_i has a 1 in its i-th position and
zeroes everywhere else). Then, two vectors are adjacent in the n-hypercube
graph Q_n (resp. in the graph Q_{n,k}) if and only if their difference is one of the
standard basis vectors (resp., a sum of k distinct standard basis vectors). Try
to use this to find a graph isomorphism from Q_n to a subgraph of Q_{n,k}.]


The next exercise extends the idea of our proof of Theorem 2.14.9:


Exercise 2.24. Let n ≥ 1. Let Q_n be the n-hypercube graph, as in Definition
2.14.7. Recall that “hamp” is short for “Hamiltonian path”.

At what vertices can a hamp of Q_n end if it starts at the vertex 00⋯0?
(Find all possibilities, and prove that they are possible and all other vertices
are impossible.)
3. Multigraphs


3.1. Definitions 


So far, we have been working with simple graphs. We shall now introduce 
several other kinds of graphs, starting with the multigraphs. 


Definition 3.1.1. Let V be a set. Then, P_{1,2} (V) shall mean the set of all
1-element or 2-element subsets of V. In other words,


P_{1,2} (V) := {S ⊆ V | |S| ∈ {1,2}}
= {{u, v} | u, v ∈ V not necessarily distinct} .


For instance,
P_{1,2} ({1,2,3}) = {{1}, {2}, {3}, {1,2}, {1,3}, {2,3}}.
We can now define multigraphs:


Definition 3.1.2. A multigraph is a triple (V, E, φ), where V and E are two
finite sets, and φ : E → P_{1,2} (V) is a map.


Example 3.1.3. Here is a multigraph: 


Formally speaking, this multigraph is the triple (V, E, φ), where
V = {1,2,3,4,5}, E = {α, β, γ, δ, ε, κ, λ},


and where φ : E → P_{1,2} (V) is the map that sends α, β, γ, δ, ε, κ, λ to
{1,2}, {2,3}, {2,3}, {4,5}, {4,5}, {4,5}, {1}, respectively. (Of course, you
can write {1} as {1,1}.)


This suggests the following terminology (most of which is a calque of our 
previously defined terminology for simple graphs): 
Definition 3.1.4. Let G = (V, E, φ) be a multigraph. Then:
(a) The elements of V are called the vertices of G.
The set V is called the vertex set of G, and is denoted V (G).


(b) The elements of E are called the edges of G.
The set E is called the edge set of G, and is denoted E (G).


(c) If e is an edge of G, then the elements of φ (e) are called the endpoints
of e.


(d) We say that an edge e contains a vertex v if v ∈ φ (e) (in other words, if
v is an endpoint of e).


(e) Two vertices u and v are said to be adjacent if there exists an edge e ∈ E
whose endpoints are u and v.


(f) Two edges e and f are said to be parallel if φ (e) = φ (f). (In the above
example, any two of the edges δ, ε, κ are parallel.)


(g) We say that G has no parallel edges if G has no two distinct edges that
are parallel.


(h) An edge e is called a loop (or self-loop) if φ (e) is a 1-element set (i.e.,
if e has only one endpoint). (In Example 3.1.3, the edge λ is a loop.)


(i) We say that G is loopless if G has no loops (among its edges).


(j) The degree deg v (also written deg_G v) of a vertex v of G is defined to
be the number of edges that contain v, where loops are counted twice.
In other words,


deg v = deg_G v := |{e ∈ E | v ∈ φ (e)}| + |{e ∈ E | φ (e) = {v}}| .

Here, the first addend counts all edges that contain v, whereas the second
addend counts all loops that contain v once again.


(Note that, unlike in the case of a simple graph, deg v is not the number
of neighbors of v, unless it happens that v is not contained in any loops
or parallel edges.)


(For example, in Example 3.1.3, we have deg 1 = 3 and deg 2 = 3 and
deg 3 = 2 and deg 4 = 3 and deg 5 = 3.)


(k) A walk in G means a list of the form


(v_0, e_1, v_1, e_2, v_2, ..., e_k, v_k) (with k ≥ 0),
where v_0, v_1, ..., v_k are vertices of G, where e_1, e_2, ..., e_k are edges of G,
and where each i ∈ {1,2,...,k} satisfies


φ (e_i) = {v_{i−1}, v_i}


(that is, the endpoints of each edge e_i are v_{i−1} and v_i). Note that we
have to record both the vertices and the edges in our walk, since we
want the walk to “know” which edges it traverses. (For instance, in
Example 3.1.3, the two walks (1, α, 2, β, 3) and (1, α, 2, γ, 3) are distinct.)


(l) The vertices of a walk (v_0, e_1, v_1, e_2, v_2, ..., e_k, v_k) are v_0, v_1, ..., v_k; the
edges of this walk are e_1, e_2, ..., e_k. This walk is said to start at v_0 and
end at v_k; it is also said to be a walk from v_0 to v_k. Its starting point is
v_0, and its ending point is v_k. Its length is k.


(m) A path means a walk whose vertices are distinct.


(n) The notions of “path-connected” and “connected” and “component”
are defined exactly as for simple graphs. The symbol ≃_G still means
“path-connected”.


(o) A closed walk (or circuit) means a walk (v_0, e_1, v_1, e_2, v_2, ..., e_k, v_k) with
v_k = v_0.


(p) A cycle means a closed walk (v_0, e_1, v_1, e_2, v_2, ..., e_k, v_k) such that


• the vertices v_0, v_1, ..., v_{k−1} are distinct;
• the edges e_1, e_2, ..., e_k are distinct;


• we have k ≥ 1.


(Note that we are not requiring k ≥ 3 any more, as we did for simple
graphs. Thus, in Example 3.1.3, both (2, β, 3, γ, 2) and (1, λ, 1) are
cycles, but (2, β, 3, β, 2) is not. The purpose of the “k ≥ 3” requirement
for cycles in simple graphs was to disallow closed walks such as
(2, β, 3, β, 2) from being cycles; but they are now excluded by the “the
edges e_1, e_2, ..., e_k are distinct” condition.)


(q) Hamiltonian paths and cycles are defined as for simple graphs.


(r) We draw a multigraph by drawing each vertex as a point, each edge as
a curve, and labeling both the vertices and the edges (or not, if we don’t
care about what they are). An example of such a drawing appeared in
Example 3.1.3.


So there are two differences between simple graphs and multigraphs: 


1. A multigraph can have loops, whereas a simple graph cannot. 
2. In a simple graph, an edge e is a set of two vertices, whereas in a multigraph,
an edge e has a set of two vertices (possibly two equal ones, if e is a
loop) assigned to it by the map $\varphi$. This not only allows for parallel edges,
but also lets us store some information in the “identities” of the edges.
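To make the role of the map $\varphi$ concrete, here is a minimal Python sketch. It assumes that a multigraph is stored as a vertex set together with a dictionary from edge names to endpoint sets; this representation, the function names and the small example graph are all illustrative choices, not taken from the text.

```python
# Hypothetical example multigraph (not one from the text):
# vertices 1, 2, 3 and edges "a", "b", "c", "l".
V = {1, 2, 3}
phi = {
    "a": frozenset({1, 2}),
    "b": frozenset({2, 3}),
    "c": frozenset({2, 3}),  # parallel to "b"
    "l": frozenset({1}),     # a loop at vertex 1
}

def degree(v, phi):
    """Number of edges that contain v, with loops counted twice."""
    containing = sum(1 for ends in phi.values() if v in ends)
    loops = sum(1 for ends in phi.values() if ends == frozenset({v}))
    return containing + loops

def is_walk(seq, V, phi):
    """Check that seq = (v0, e1, v1, ..., ek, vk) satisfies phi(e_i) = {v_{i-1}, v_i}."""
    if len(seq) % 2 == 0:
        return False
    vertices, edges = seq[0::2], seq[1::2]
    if any(v not in V for v in vertices):
        return False
    return all(
        e in phi and phi[e] == frozenset({vertices[i], vertices[i + 1]})
        for i, e in enumerate(edges)
    )

def is_cycle(seq, V, phi):
    """Closed walk with k >= 1, distinct v0, ..., v_{k-1}, and distinct edges."""
    if not is_walk(seq, V, phi):
        return False
    vertices, edges = seq[0::2], seq[1::2]
    k = len(edges)
    return (
        k >= 1
        and vertices[0] == vertices[-1]
        and len(set(vertices[:-1])) == k
        and len(set(edges)) == k
    )
```

In this example, `is_cycle((2, "b", 3, "c", 2), V, phi)` and `is_cycle((1, "l", 1), V, phi)` both hold, whereas `(2, "b", 3, "b", 2)` is rejected because it repeats an edge.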


Nevertheless, the two notions have much in common; thus, they are both 
called “graphs”: 


Convention 3.1.5. The word “graph” means either “simple graph” or “multi- 
graph”. The precise meaning should usually be understood from the context. 
(I will try not to use it when it could cause confusion.) 


Fortunately, simple graphs and multigraphs have many properties in com- 
mon, and often it is not hard to derive a result about multigraphs from the 
analogous result about simple graphs or vice versa. We will soon explore how 
some of the properties we have seen in the previous chapter can be adapted 
to multigraphs. First, however, let us explain how to convert multigraphs into 
simple graphs and vice versa. 


3.2. Conversions 


We can turn each multigraph into a simple graph, but at a cost of losing some 
information: 


Definition 3.2.1. Let $G = (V, E, \varphi)$ be a multigraph. Then, the underlying
simple graph $G^{\operatorname{simp}}$ of G means the simple graph

\[
\left(V, \ \{\varphi(e) \mid e \in E \text{ is not a loop}\}\right).
\]

In other words, it is the simple graph with vertex set V in which two distinct
vertices u and v are adjacent if and only if u and v are adjacent in G. Thus,
$G^{\operatorname{simp}}$ is obtained from G by removing loops and “collapsing” parallel edges
to a single edge.
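The construction can be sketched in a few lines of Python. We assume (purely for illustration) that the map $\varphi$ is given as a dictionary `phi` from edge names to frozensets of endpoints; the example graph is made up.

```python
def underlying_simple_graph(V, phi):
    """Return (V, E_simp) with E_simp = {phi(e) | e is not a loop}.

    Storing the edges in a set collapses parallel edges automatically.
    """
    simple_edges = {ends for ends in phi.values() if len(ends) == 2}
    return set(V), simple_edges

# Hypothetical example: two parallel edges "b", "c" and a loop "l".
V = {1, 2, 3}
phi = {
    "a": frozenset({1, 2}),
    "b": frozenset({2, 3}),
    "c": frozenset({2, 3}),
    "l": frozenset({1}),
}
vertices, simple_edges = underlying_simple_graph(V, phi)
```

Here the loop `"l"` disappears and the parallel edges `"b"` and `"c"` collapse to the single edge {2, 3}, so the information about edge multiplicities is lost.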


For example, the underlying simple graph of the multigraph G in Example 
is 


| 


Conversely, each simple graph can be viewed as a multigraph: 
Definition 3.2.2. Let $G = (V, E)$ be a simple graph. Then, the corresponding
multigraph $G^{\operatorname{mult}}$ is defined to be the multigraph

\[
(V, E, \iota),
\]

where $\iota : E \to \mathcal{P}_{1,2}(V)$ is the map sending each $e \in E$ to e itself.


Example 3.2.3. If 


then 


{1,3} 


As we said, the “underlying simple graph” construction $G \mapsto G^{\operatorname{simp}}$ destroys
information, so it is irreversible. This being said, the two constructions
$G \mapsto G^{\operatorname{simp}}$ and $G \mapsto G^{\operatorname{mult}}$ come fairly close to undoing one another¹⁵:


Proposition 3.2.4.

(a) If G is a simple graph, then $(G^{\operatorname{mult}})^{\operatorname{simp}} = G$.

(b) If G is a loopless multigraph that has no parallel edges, then
$(G^{\operatorname{simp}})^{\operatorname{mult}} \cong G$. (This is just an isomorphism, not an equality, since the
“identities” of the edges of G have been forgotten in $G^{\operatorname{simp}}$ and cannot
be recovered.)

(c) If G is a multigraph that has loops or (distinct) parallel edges, then
the multigraph $(G^{\operatorname{simp}})^{\operatorname{mult}}$ has fewer edges than G and thus is not
isomorphic to G.


¹⁵ In the following proposition, we will use the notion of an “isomorphism of multigraphs”. A
rigorous definition of this notion is given in Definition 3.3.4 further below (but it is more
or less what you would expect: it is a way to relabel the vertices and the edges of one
multigraph to obtain those of another).

Proof. A matter of understanding the definitions. □


We will often identify a simple graph G with the corresponding multigraph
$G^{\operatorname{mult}}$. This may be dangerous, because we have defined notions such as
adjacency, walks, paths, cycles, etc. both for simple graphs and for multigraphs;
thus, when we identify a simple graph G with the multigraph $G^{\operatorname{mult}}$, we are
potentially inviting ambiguity (for example, does “cycle of G” mean a cycle of
the simple graph G or of the multigraph $G^{\operatorname{mult}}$?). Fortunately, this ambiguity is
harmless, because whenever G is a simple graph, any of the notions we defined
for G is equivalent to the corresponding notion for the multigraph $G^{\operatorname{mult}}$. For
example, for the notion of a cycle, we have the following:


Proposition 3.2.5. Let G be a simple graph. Then:

(a) If $(v_0, e_1, v_1, e_2, v_2, \ldots, e_k, v_k)$ is a cycle of the multigraph $G^{\operatorname{mult}}$, then
$(v_0, v_1, \ldots, v_k)$ is a cycle of the simple graph G.

(b) Conversely, if $(v_0, v_1, \ldots, v_k)$ is a cycle of the simple graph G, then
$(v_0, \{v_0, v_1\}, v_1, \{v_1, v_2\}, v_2, \ldots, v_{k-1}, \{v_{k-1}, v_k\}, v_k)$ is a cycle of the
multigraph $G^{\operatorname{mult}}$.


Proof. This is not completely obvious, since our definitions of a cycle of a simple
graph and of a cycle of a multigraph were somewhat different. The proof boils
down to checking the following two statements:

1. If $(v_0, v_1, \ldots, v_k)$ is a cycle of the simple graph G, then its edges
$\{v_0, v_1\}, \{v_1, v_2\}, \ldots, \{v_{k-1}, v_k\}$ are distinct.

2. If $(v_0, e_1, v_1, e_2, v_2, \ldots, e_k, v_k)$ is a cycle of the multigraph $G^{\operatorname{mult}}$, then $k \geq 3$.

Checking statement 2 is easy (we cannot have k = 1 since $G^{\operatorname{mult}}$ has no loops,
and we cannot have k = 2 since this would lead to $e_1 = e_2$). Statement 1
is also clear, since the distinctness of the k vertices $v_0, v_1, \ldots, v_{k-1}$ forces the
2-element sets formed from these k vertices to also be distinct (and since the edges
$\{v_0, v_1\}, \{v_1, v_2\}, \ldots, \{v_{k-1}, v_k\} = \{v_{k-1}, v_0\}$ are such 2-element sets). □


For all other notions discussed above, it is even more obvious that there is no
ambiguity.


3.3. Generalizing from simple graphs to multigraphs


Now, as promised, we shall revisit the results of Chapter 2, and see which of
them also hold for multigraphs instead of simple graphs.


3.3.1. The Ramsey number R(3,3)


One of the first properties of simple graphs that we proved is the following
(Proposition 2.3.1):


Proposition 3.3.1. Let G be a simple graph with $|V(G)| \geq 6$ (that is, G has at
least 6 vertices). Then, at least one of the following two statements holds:

• Statement 1: There exist three distinct vertices a, b and c of G such that
ab, bc and ca are edges of G.

• Statement 2: There exist three distinct vertices a, b and c of G such that
none of ab, bc and ca is an edge of G.


This is still true for multigraphs¹⁶, because replacing a multigraph G by the
underlying simple graph $G^{\operatorname{simp}}$ does not change the meaning of the statement.


3.3.2. Degrees 


In Definition 2.4.1, we defined the degree of a vertex v in a simple graph
$G = (V, E)$ by

\begin{align*}
\deg v :={}& (\text{the number of edges } e \in E \text{ that contain } v) \\
={}& (\text{the number of neighbors of } v) \\
={}& \left|\{u \in V \mid uv \in E\}\right| \\
={}& \left|\{e \in E \mid v \in e\}\right| .
\end{align*}

These equalities no longer hold when G is a multigraph. Parallel edges correspond
to the same neighbor, so the number of neighbors of v is only a lower
bound on deg v.


Proposition 2.4.2 (which says that if G is a simple graph with n vertices, then
any vertex v of G has degree $\deg v \in \{0, 1, \ldots, n-1\}$) also no longer holds
for multigraphs, because you can have arbitrarily many edges in a multigraph
with just 1 or 2 vertices. (You can even have parallel loops!)


Is Proposition true for multigraphs? Yes, because we have said that 
loops should count twice in the definition of the degree. The proof needs some 
tweaking, though. Let me give a slightly different proof; but first, let me state 
the claim for multigraphs as a proposition of its own: 


¹⁶ Of course, we should understand it appropriately: i.e., we should read “ab is an edge” as
“there is an edge with endpoints a and b”.
Proposition 3.3.2 (Euler 1736 for multigraphs). Let G be a multigraph. Then,
the sum of the degrees of all vertices of G equals twice the number of edges
of G. In other words,

\[
\sum_{v \in V(G)} \deg v = 2 \cdot |E(G)| .
\]


Proof. Write G as $G = (V, E, \varphi)$; thus, $V(G) = V$ and $E(G) = E$.

For each edge e, let us (arbitrarily) choose one endpoint of e and denote it
by $\alpha(e)$. The other endpoint will be called $\beta(e)$. If e is a loop, then we set
$\beta(e) = \alpha(e)$. Then, for each vertex v, we have

\begin{align*}
\deg v ={}& (\text{the number of } e \in E \text{ such that } v = \alpha(e)) \\
& + (\text{the number of } e \in E \text{ such that } v = \beta(e))
\end{align*}

(note how loops get counted twice on the right hand side, because if $e \in E$ is a
loop, then v is both $\alpha(e)$ and $\beta(e)$ at the same time). Summing up this equality
over all $v \in V$, we obtain

\begin{align*}
\sum_{v \in V} \deg v ={}& \sum_{v \in V} (\text{the number of } e \in E \text{ such that } v = \alpha(e)) \\
& + \sum_{v \in V} (\text{the number of } e \in E \text{ such that } v = \beta(e)) .
\end{align*}

However,

\[
\sum_{v \in V} (\text{the number of } e \in E \text{ such that } v = \alpha(e)) = |E| ,
\]

since each edge $e \in E$ is counted in exactly one addend of this sum. Similarly,

\[
\sum_{v \in V} (\text{the number of } e \in E \text{ such that } v = \beta(e)) = |E| .
\]

Thus, the above equality becomes

\begin{align*}
\sum_{v \in V} \deg v ={}& \underbrace{\sum_{v \in V} (\text{the number of } e \in E \text{ such that } v = \alpha(e))}_{= |E|} \\
& + \underbrace{\sum_{v \in V} (\text{the number of } e \in E \text{ such that } v = \beta(e))}_{= |E|} \\
={}& |E| + |E| = 2 \cdot |E| .
\end{align*}

This proves Proposition 3.3.2. □


This is a good motivation for counting loops twice in the definition of a 
degree. 
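The counting argument in the proof can be replayed numerically. The following Python sketch assumes that $\varphi$ is stored as a dictionary `phi` from edge names to frozensets of endpoints, and that the vertices are comparable, so that "the smaller endpoint" can serve as the arbitrary choice of $\alpha(e)$. Both assumptions, and the example graph, are ours.

```python
from collections import Counter

def degrees_via_alpha_beta(phi):
    """Compute all degrees as in the proof: count alpha-hits plus beta-hits."""
    # alpha(e): one chosen endpoint (here: the smaller one);
    # beta(e): the other endpoint, which equals alpha(e) when e is a loop.
    alpha = {e: min(ends) for e, ends in phi.items()}
    beta = {e: max(ends) for e, ends in phi.items()}
    return Counter(alpha.values()) + Counter(beta.values())

# Hypothetical example with a pair of parallel edges and a loop.
phi = {
    "a": frozenset({1, 2}),
    "b": frozenset({2, 3}),
    "c": frozenset({2, 3}),
    "l": frozenset({1}),
}
deg = degrees_via_alpha_beta(phi)
degree_sum = sum(deg.values())
```

In this example the degrees are 3, 3 and 2, so `degree_sum` is 8, which is twice the number of edges. The loop `"l"` contributes 2 to the degree of vertex 1 because it is hit once as $\alpha$ and once as $\beta$.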


The handshake lemma (Corollary 2.4.4) still holds for multigraphs. In other
words, we have the following:

Corollary 3.3.3 (handshake lemma). Let G be a multigraph. Then, the number
of vertices v of G whose degree deg v is odd is even.


Proof. This follows from Proposition 3.3.2 in the same way as for simple graphs.
□


Proposition 2.4.5 fails for multigraphs. For example, the multigraph


has three vertices with degrees 1, 2, 3. Fortunately,
Proposition 2.4.5 was more of a curiosity than a useful fact.


Mantel’s theorem (Theorem 2.4.6) also fails for multigraphs, because we can
join two vertices with a lot of parallel edges and thus satisfy $e > n^2/4$ for stupid
reasons without ever creating a triangle. Thus, Turán’s theorem (Theorem 2.4.8)
also fails for multigraphs.


3.3.3. Graph isomorphisms 


Graph isomorphy (and isomorphisms) can still be defined for multigraphs, but 
the definition is not the same as for simple graphs. Graph isomorphisms can 
no longer be defined merely as bijections between the vertex sets, since we also 
need to specify what they do to the edges. Instead, we define them as follows: 


Definition 3.3.4. Let $G = (V, E, \varphi)$ and $H = (W, F, \psi)$ be two multigraphs.


(a) A graph isomorphism (or isomorphism) from G to H means a pair
$(\alpha, \beta)$ of bijections

\[
\alpha : V \to W \qquad \text{and} \qquad \beta : E \to F
\]

with the property that if $e \in E$, then the endpoints of $\beta(e)$ are the
images under $\alpha$ of the endpoints of e. (This property can also be restated
as a commutative diagram

\[
\begin{array}{ccc}
E & \xrightarrow{\ \beta\ } & F \\
{\scriptstyle \varphi} \downarrow & & \downarrow {\scriptstyle \psi} \\
\mathcal{P}_{1,2}(V) & \xrightarrow{\ \mathcal{P}(\alpha)\ } & \mathcal{P}_{1,2}(W)
\end{array}
\]

where $\mathcal{P}(\alpha)$ is the map from $\mathcal{P}_{1,2}(V)$ to $\mathcal{P}_{1,2}(W)$ that sends each subset
$\{u, v\} \in \mathcal{P}_{1,2}(V)$ to $\{\alpha(u), \alpha(v)\} \in \mathcal{P}_{1,2}(W)$. If you are used to
category theory, this restatement may look more natural to you.)


(b) We say that G and H are isomorphic (this is written $G \cong H$) if there
exists a graph isomorphism from G to H.


Again, isomorphy of multigraphs is an equivalence relation.
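The defining property of an isomorphism is easy to test for a given candidate pair $(\alpha, \beta)$. The Python sketch below assumes that each multigraph is given as a pair of a vertex set and a dictionary from edge names to frozensets of endpoints, and that $\alpha$ and $\beta$ are given as dictionaries. These conventions and the two small example graphs are illustrative only. Note that the sketch only verifies a given pair; it does not search for one.

```python
def is_isomorphism(alpha, beta, G, H):
    """Check whether (alpha, beta) is a graph isomorphism from G to H."""
    (V, phi), (W, psi) = G, H
    # alpha must be a bijection V -> W, and beta a bijection E -> F.
    alpha_ok = set(alpha) == set(V) and set(alpha.values()) == set(W) and len(alpha) == len(W)
    beta_ok = set(beta) == set(phi) and set(beta.values()) == set(psi) and len(beta) == len(psi)
    if not (alpha_ok and beta_ok):
        return False
    # Commutativity of the diagram: psi(beta(e)) = P(alpha)(phi(e)) for every edge e.
    return all(
        psi[beta[e]] == frozenset(alpha[v] for v in phi[e])
        for e in phi
    )

# Hypothetical example: two parallel edges plus a loop, with different labels.
G = ({1, 2}, {"x": frozenset({1, 2}), "y": frozenset({1, 2}), "z": frozenset({2})})
H = ({"p", "q"}, {"e1": frozenset({"p", "q"}), "e2": frozenset({"p", "q"}), "e3": frozenset({"q"})})
beta = {"x": "e1", "y": "e2", "z": "e3"}
good_alpha = {1: "p", 2: "q"}
bad_alpha = {1: "q", 2: "p"}   # sends the loop's vertex to the wrong place
```

With `good_alpha` the check succeeds; with `bad_alpha` it fails, because the loop `"z"` at vertex 2 would have to be sent to a loop at `"p"`, but `"e3"` is a loop at `"q"`.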


3.3.4. Complete graphs, paths, cycles 


In Definition 2.6.1, Definition 2.6.2 and Definition 2.6.3, we defined the complete
graphs $K_n$, the path graphs $P_n$ and the cycle graphs $C_n$ as simple graphs. Thus,
all of them can be viewed as multigraphs if one so desires (since each simple
graph G gives rise to a multigraph $G^{\operatorname{mult}}$).

However, using multigraphs, we can extend our definition of n-th cycle
graphs $C_n$ to the case n = 1 and also tweak it in the case n = 2 to make it
more natural. We do this as follows:


Definition 3.3.5. We modify the definition of cycle graphs (Definition 2.6.3)
as follows:

(a) We redefine the 2-nd cycle graph $C_2$ to be the multigraph with two
vertices 1 and 2 and two parallel edges with endpoints 1 and 2. (We
don’t care what the edges are, only that there are two of them and each
has endpoints 1 and 2.) Thus, it looks as follows:

[Figure: the vertices 1 and 2, joined by two parallel edges.]

(b) We define the 1-st cycle graph $C_1$ to be the multigraph with one vertex
1 and one edge (which is necessarily a loop). Thus, it looks as follows:

[Figure: the vertex 1 with a single loop.]

This has the effect that the n-th cycle graph $C_n$ has exactly n edges for each
$n \geq 1$ (rather than having 1 edge for n = 2, as it did back when it was a simple
graph).


3.3.5. Induced submultigraphs 


In Definition we defined subgraphs and induced subgraphs of a simple 
graph. The corresponding notions for multigraphs are defined as follows: 


Definition 3.3.6. Let $G = (V, E, \varphi)$ be a multigraph.


(a) A submultigraph of G means a multigraph of the form $H = (W, F, \psi)$,
where $W \subseteq V$ and $F \subseteq E$ and $\psi = \varphi|_F$. In other words, a submultigraph
of G means a multigraph H whose vertices are vertices of G and
whose edges are edges of G and whose edges have the same endpoints
in H as they do in G.

We often abbreviate “submultigraph” as “subgraph”.

(b) Let S be a subset of V. The induced submultigraph of G on the set S
denotes the submultigraph

\[
\left(S, \ E', \ \varphi|_{E'}\right)
\]

of G, where

\[
E' := \{e \in E \mid \text{all endpoints of } e \text{ belong to } S\} .
\]

In other words, it denotes the submultigraph of G whose vertices are
the elements of S, and whose edges are precisely those edges of G
both of whose endpoints belong to S. We denote this induced submultigraph
by $G[S]$.


(c) An induced submultigraph of G means a submultigraph of G that is
the induced submultigraph of G on S for some $S \subseteq V$.


The infix “multi” is often omitted. So we often speak of “subgraphs” 
instead of “submultigraphs”. 
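As a small illustration of part (b), here is a Python sketch of the induced submultigraph $G[S]$. It assumes that $\varphi$ is stored as a dictionary `phi` from edge names to frozensets of endpoints (our convention, not the text's); the example graph is made up.

```python
def induced_submultigraph(phi, S):
    """Return G[S] as (S, restriction of phi to the edges with all endpoints in S)."""
    S = set(S)
    restricted = {e: ends for e, ends in phi.items() if ends <= S}
    return S, restricted

# Hypothetical example.
phi = {
    "a": frozenset({1, 2}),
    "b": frozenset({2, 3}),
    "c": frozenset({2, 3}),
    "l": frozenset({1}),
}
S, phi_S = induced_submultigraph(phi, {1, 2})
```

The edges of $G[\{1, 2\}]$ are `"a"` and the loop `"l"`, while `"b"` and `"c"` are dropped because their endpoint 3 lies outside S. The surviving edges keep their names and endpoints, as the definition requires.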


With these definitions, we can now identify cycles in a multigraph with subgraphs
isomorphic to a cycle graph: A cycle of length n in a multigraph G is
“the same as” a submultigraph of G isomorphic to $C_n$. (We leave the details to
the reader.)


3.3.6. Disjoint unions 


In Section 2.8, we defined the disjoint union of two or more simple graphs. The
analogous definition for multigraphs is straightforward and left to the reader.


3.3.7. Walks 


We already defined walks, paths, closed walks and cycles for multigraphs back 
in Section The length of a walk is still defined to be its number of edges. 
Now, let’s see which of their basic properties (seen in Section 2.9) still hold for
multigraphs.

First of all, the edges of a path are still always distinct. This is just as easy to 
prove as for simple graphs. 

Next, let us see how two walks can be “spliced” together: 


Proposition 3.3.7. Let G be a multigraph. Let u, v and w be three vertices
of G. Let $a = (a_0, e_1, a_1, \ldots, e_k, a_k)$ be a walk from u to v. Let
$b = (b_0, f_1, b_1, \ldots, f_\ell, b_\ell)$ be a walk from v to w. Then,

\begin{align*}
& (a_0, e_1, a_1, \ldots, e_k, a_k, f_1, b_1, f_2, b_2, \ldots, f_\ell, b_\ell) \\
&= (a_0, e_1, a_1, \ldots, a_{k-1}, e_k, b_0, f_1, b_1, \ldots, f_\ell, b_\ell) \\
&= (a_0, e_1, a_1, \ldots, a_{k-1}, e_k, v, f_1, b_1, \ldots, f_\ell, b_\ell)
\end{align*}

is a walk from u to w. This walk shall be denoted a * b.


Walks can be reversed (i.e., walked in backwards direction):


Proposition 3.3.8. Let G be a multigraph. Let u and v be two vertices of G.
Let $a = (a_0, e_1, a_1, \ldots, e_k, a_k)$ be a walk from u to v. Then:

(a) The list $(a_k, e_k, a_{k-1}, e_{k-1}, \ldots, e_1, a_0)$ is a walk from v to u. We denote
this walk by rev a and call it the reversal of a.

(b) If a is a path, then rev a is a path again.


Walks that are not paths contain smaller walks between the same vertices:


Proposition 3.3.9. Let G be a multigraph. Let u and v be two vertices of G.
Let $a = (a_0, e_1, a_1, \ldots, e_k, a_k)$ be a walk from u to v. Assume that a is not a
path. Then, there exists a walk from u to v whose length is smaller than k.


Corollary 3.3.10 (When there is a walk, there is a path). Let G be a multigraph.
Let u and v be two vertices of G. Assume that there is a walk from u
to v of length k for some $k \in \mathbb{N}$. Then, there is a path from u to v of length
$\leq k$.


All these results can be proved in the same way as their counterparts for 
simple graphs; the only change needed is to record the edges in the walk. 
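The three operations behind these results (splicing, reversing, and cutting out a closed sub-walk) can be sketched in Python as follows. We assume, for illustration only, that a walk is stored as a tuple alternating between vertices and edge names; the function names are ours.

```python
def splice(a, b):
    """The walk a * b; requires that a ends where b starts."""
    if a[-1] != b[0]:
        raise ValueError("the first walk must end where the second one starts")
    return a + b[1:]

def reverse(a):
    """The reversal rev a: read the walk backwards."""
    return a[::-1]

def shorten_to_path(a):
    """Repeatedly cut out the part between two equal vertices until a path remains."""
    a = list(a)
    while True:
        first_seen = {}
        for j, v in enumerate(a[0::2]):
            if v in first_seen:
                i = first_seen[v]
                # Vertex number i sits at list index 2*i; drop e_{i+1}, ..., e_j, w_j.
                a = a[: 2 * i + 1] + a[2 * j + 1 :]
                break
            first_seen[v] = j
        else:
            return tuple(a)

# Hypothetical walks in a multigraph with edges a = {1,2}, b = {2,3}, c = {2,3}.
walk_1 = (1, "a", 2, "b", 3)
walk_2 = (3, "c", 2)
spliced = splice(walk_1, walk_2)
```

Here `spliced` is `(1, "a", 2, "b", 3, "c", 2)`, a walk from 1 to 2 that visits the vertex 2 twice; `shorten_to_path(spliced)` removes the closed part between the two visits and returns the path `(1, "a", 2)`. Each cut strictly decreases the length, which is why the loop terminates, mirroring the induction behind Corollary 3.3.10.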


Given a multigraph G and two vertices u and v of G, we can ask ourselves 
the same five Questions 1, 2, 3, 4 and 5 that we asked for a simple graph G 
in Subsection The answers we gave in that subsection still apply without 
requiring substantial changes; the only necessary modification is that we now 
have to keep track of the edges in a path or walk. (The reader can easily fill in 
the details here.) 


3.3.8. Path-connectedness 


The relation “path-connected” is defined for multigraphs just as it is for simple
graphs (Definition 2.9.8), and is still denoted $\simeq_G$. It is still an equivalence
relation (and the proof is the same as for simple graphs). The following also
holds (with the same proof as for simple graphs):

Proposition 3.3.11. Let G be a multigraph. Let u and v be two vertices of G.
Then, $u \simeq_G v$ if and only if there exists a path from u to v.


The definitions of “components” and “connected” for multigraphs are the
same as for simple graphs (Definition 2.9.11 and Definition 2.9.12). The following
propositions can be proved in the same way as we proved their analogues
for simple graphs:


Proposition 3.3.12. Let G be a multigraph. Let C be a component of G.
Then, the induced subgraph (= submultigraph) of G on the set C is connected.

Proposition 3.3.13. Let G be a multigraph. Let $C_1, C_2, \ldots, C_k$ be all components
of G (listed without repetition).
Thus, G is isomorphic to the disjoint union $G[C_1] \sqcup G[C_2] \sqcup \cdots \sqcup G[C_k]$.


The following proposition is an analogue of Proposition 2.10.4 for multigraphs:


Proposition 3.3.14. Let G be a multigraph. Let w be a walk of G such that no
two adjacent edges of w are identical. (By “adjacent edges”, we mean edges
of the form $e_{i-1}$ and $e_i$, where $e_1, e_2, \ldots, e_k$ are the edges of w from first to
last.)

Then, w either is a path or contains a cycle (i.e., there exists a cycle of G
whose edges are edges of w).


Proof. The proof of this proposition for multigraphs is more or less the same as
it was for simple graphs (i.e., as the proof of Proposition 2.10.4), with a mild
difference in how we prove that the walk $(w_i, w_{i+1}, \ldots, w_j)$ is a cycle (of course,
this walk is no longer $(w_i, w_{i+1}, \ldots, w_j)$ now, but rather $(w_i, e_{i+1}, w_{i+1}, \ldots, e_j, w_j)$,
because the edges need to be included).¹⁷ □


¹⁷ Here are some details:

We assume that w is not a path, and we write the walk w as $(w_0, e_1, w_1, e_2, w_2, \ldots, e_k, w_k)$.
Then, there exists a pair (i, j) of integers i and j with i < j and $w_i = w_j$. Among all such
pairs, we pick one with minimum difference j − i. Then, $(w_i, e_{i+1}, w_{i+1}, \ldots, e_j, w_j)$ is a closed
walk. We claim that this closed walk is a cycle.

To do so, we need to show that

1. the vertices $w_i, w_{i+1}, \ldots, w_{j-1}$ are distinct;
2. the edges $e_{i+1}, e_{i+2}, \ldots, e_j$ are distinct;
3. we have $j - i \geq 1$.

The first of these claims follows from the minimality of j − i. The third follows from i < j.

It remains to prove the second claim. In other words, it remains to prove that the edges
$e_{i+1}, e_{i+2}, \ldots, e_j$ are distinct, i.e., that we have $e_a \neq e_b$ for any two integers a and b satisfying
$i < a < b \leq j$. Let us do this. Let a and b be two integers satisfying $i < a < b \leq j$. We must
show that $e_a \neq e_b$. We distinguish two cases: the case $a = b - 1$ and the case $a \neq b - 1$.

• If $a = b - 1$, then $e_a$ and $e_b$ are two adjacent edges of w and thus distinct (since we
assumed that no two adjacent edges of w are identical). Thus, $e_a \neq e_b$ is proved in
the case when $a = b - 1$.

• Now, consider the case when $a \neq b - 1$. In this case, we must have $a < b - 1$ (since
$a < b$ entails $a \leq b - 1$). Also, $i \leq a - 1$ (since $i < a$). Hence, $i \leq a - 1 < a < b - 1 \leq j - 1$
(since $b \leq j$). Therefore, $b - 1$, $a - 1$ and $a$ are three distinct elements of the
set $\{i, i+1, \ldots, j-1\}$. Consequently, $w_{b-1}, w_{a-1}, w_a$ are three distinct vertices (since
the vertices $w_i, w_{i+1}, \ldots, w_{j-1}$ are distinct). Therefore, $w_{b-1} \notin \{w_{a-1}, w_a\} = \varphi(e_a)$
(since w is a walk, so that the edge $e_a$ has endpoints $w_{a-1}$ and $w_a$). However, $\varphi(e_b) =
\{w_{b-1}, w_b\}$ (since w is a walk, so that the edge $e_b$ has endpoints $w_{b-1}$ and $w_b$). Now,
comparing $w_{b-1} \in \{w_{b-1}, w_b\} = \varphi(e_b)$ with $w_{b-1} \notin \varphi(e_a)$, we see that the sets $\varphi(e_b)$
and $\varphi(e_a)$ must be distinct (since $\varphi(e_b)$ contains $w_{b-1}$ but $\varphi(e_a)$ does not). In other
words, $\varphi(e_b) \neq \varphi(e_a)$. Hence, $e_b \neq e_a$. In other words, $e_a \neq e_b$. Thus, $e_a \neq e_b$ is
proved in the case when $a \neq b - 1$.

We have now proved $e_a \neq e_b$ in both cases, so we are done.


Just as for simple graphs, we get the following corollary:


Corollary 3.3.15. Let G be a multigraph. Assume that G has a closed walk
w of length > 0 such that no two adjacent edges of w are identical. Then, G
has a cycle.


The analogue of Theorem 2.10.7 for multigraphs is true as well:


Theorem 3.3.16. Let G be a multigraph. Let u and v be two vertices in G.
Assume that there are two distinct paths from u to v. Then, G has a cycle.


Proof. For simple graphs, this was proved as Theorem 2.10.7 above. The same
proof applies to multigraphs, once the obvious changes are made (e.g., instead
of $p_{a-1} p_a$ and $q_{b-1} q_b$, we need to take the last edges of the two walks p and
q). □


In contrast, Proposition 2.11.1 is false for multigraphs. In fact, we can take
a multigraph with a single vertex and lots of loops around it. In that case, its
degree can be very large, but it has no cycles of length > 1.


3.3.9. G \ e, bridges and cut-edges


Next, we extend the definition of G \ e (Definition 2.12.1) to multigraphs:


Definition 3.3.17. Let $G = (V, E, \varphi)$ be a multigraph. Let e be an edge of G.
Then, G \ e will mean the graph obtained from G by removing this edge e.
In other words,

\[
G \setminus e := \left(V, \ E \setminus \{e\}, \ \varphi|_{E \setminus \{e\}}\right) .
\]
Some authors write G − e for G \ e.

The analogue of Theorem 2.12.2 for multigraphs holds (and can be proved in
the same way as Theorem 2.12.2):


Theorem 3.3.18. Let G be a multigraph. Let e be an edge of G. Then: 


(a) If e is an edge of some cycle of G, then the components of G \ e are 
precisely the components of G. (Keep in mind that the components are 
sets of vertices. It is these sets that we are talking about here, not the 
induced subgraphs on these sets.) 


(b) If e appears in no cycle of G (in other words, there exists no cycle of G 
such that e is an edge of this cycle), then the graph G \ e has one more 
component than G. 


Note that an edge e that is a loop always is an edge of a cycle (indeed, it 
creates a cycle of length 1), and can never appear on any path; thus, removing 
such an edge e obviously does not change the path-connectedness relation. 

Defining cut-edges and bridges just as we did for simple graphs (Definition
2.12.4), we equally recover the following corollary:


Corollary 3.3.19. Let e be an edge of a multigraph G. Then, e is a bridge if
and only if e is a cut-edge.


Proof. Just like the proof of Corollary 2.12.5. □
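Theorem 3.3.18 can be observed on a small example. The Python sketch below assumes that $\varphi$ is stored as a dictionary `phi` from edge names to frozensets of endpoints, computes components with a simple union-find, and compares the component counts of G and G \ e. The helper names and the example graph are ours, and the sketch only tests the component count, so it is an experiment rather than a proof.

```python
def components(V, phi):
    """Components of the multigraph (as frozensets of vertices), via union-find."""
    parent = {v: v for v in V}

    def find(v):
        while parent[v] != v:
            v = parent[v]
        return v

    for ends in phi.values():
        ends = tuple(ends)                      # one vertex for a loop, two otherwise
        parent[find(ends[0])] = find(ends[-1])  # merge the two endpoints' classes

    groups = {}
    for v in V:
        groups.setdefault(find(v), set()).add(v)
    return {frozenset(c) for c in groups.values()}

def remove_edge(phi, e):
    """The map phi restricted to E \\ {e}, i.e. the edge data of G \\ e."""
    return {f: ends for f, ends in phi.items() if f != e}

def increases_components(V, phi, e):
    """True if G \\ e has more components than G."""
    return len(components(V, remove_edge(phi, e))) > len(components(V, phi))

# Hypothetical example: "b" and "c" are parallel (so they lie on a cycle),
# "l" is a loop, and "a", "d" lie on no cycle.
V = {1, 2, 3, 4}
phi = {
    "a": frozenset({1, 2}),
    "b": frozenset({2, 3}),
    "c": frozenset({2, 3}),
    "d": frozenset({3, 4}),
    "l": frozenset({4}),
}
```

In this example, removing `"a"` or `"d"` raises the number of components from 1 to 2, whereas removing `"b"`, `"c"` or the loop `"l"` leaves the components unchanged, in line with both parts of the theorem.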


3.3.10. Dominating sets 


We defined and studied dominating sets in Section 2.13. We could define dominating
sets for multigraphs in the same way as for simple graphs, but we would
not get anything new this way. Indeed, if G is a multigraph, then the dominating
sets of G are precisely the dominating sets of $G^{\operatorname{simp}}$. Thus, we can reduce
any claims about dominating sets of multigraphs to analogous claims about
simple graphs.


3.3.11. Hamiltonian paths and cycles


As we said before, a multigraph G has a Hamiltonian path or Hamiltonian
cycle if and only if the corresponding simple graph $G^{\operatorname{simp}}$ has one. This does
not mean, however, that everything we proved about Hamiltonian paths still
applies to multigraphs. For instance, neither Ore’s theorem (Theorem 2.14.4)
nor Dirac’s theorem (Corollary 2.14.5) holds for multigraphs, because we could
duplicate edges to make degrees arbitrarily large, without necessarily creating
a Hamiltonian cycle.

Proposition 2.14.6 still holds for multigraphs, but this is clear because it can
be derived from the corresponding property of $G^{\operatorname{simp}}$.
3.3.12. Exercises


Exercise 3.1. Which of the Exercises 2.3, 2.4, 2.7, 2.14, 2.15, 2.5 and 2.8 remain
true if “simple graph” is replaced by “multigraph”?

(For each exercise that becomes false, provide a counterexample. For each 
exercise that remains true, either provide a new solution that works for multi- 
graphs, or argue that the solution we have seen applies verbatim to multi- 
graphs, or derive the multigraph case from the simple graph case.) 


Exercise 3.2. Let G be a multigraph with at least one edge. Assume that each 
vertex of G has even degree. Prove that G has a cycle. 


[Solution: This is Exercise 4 on midterm #1 from my Spring 2017 course; 


see the course page for solutions. ] 


Exercise 3.3. Let G be a multigraph. Let d > 2 be an integer. Assume that 
degv > 2 for each vertex v of G. Prove that G has a cycle whose length is 
not divisible by d. 


Exercise 3.4. Let G be a multigraph. Assume that G has exactly two vertices 
of odd degree. Prove that these two vertices are path-connected. 


Exercise 3.5. Let $G = (V, E, \varphi)$ be a multigraph that has no loops.

If $e \in E$ is an edge that contains a vertex $v \in V$, then we let e/v denote the
endpoint of e distinct from v. (If e is a loop, then this is understood to mean
v itself.)

For each $v \in V$, we define a rational number $q_v$ by

\[
q_v := \sum_{\substack{e \in E; \\ v \in \varphi(e)}} \frac{\deg(e/v)}{\deg v} .
\]

(Note that the denominator deg v on the right hand side is nonzero whenever
the sum is nonempty!)

(Thus, $q_v$ is the average degree of the neighbors of v, weighted with the
number of edges that join v to the respective neighbors. If v has no neighbors,
then $q_v = 0$.)

Prove that

\[
\sum_{v \in V} q_v \geq \sum_{v \in V} \deg v .
\]

(In other words, in a social network, your average friend has, on average,
more friends than you do!)


[Hint: Any positive reals x and y satisfy $\frac{x}{y} + \frac{y}{x} \geq 2$. Why, and how does
this help?]
Exercise 3.6. Let F be any field. (For instance, F can be $\mathbb{Q}$ or $\mathbb{R}$ or $\mathbb{C}$.)

Let $G = (V, E, \varphi)$ be a multigraph, where $V = \{1, 2, \ldots, n\}$ for some
$n \in \mathbb{N}$.

For each edge $e \in E$, we construct a column vector $x_e \in F^n$ (that is, a
column vector with n entries) as follows:

• If e is a loop, then we let $x_e$ be the zero vector.

• Otherwise, we let u and v be the two endpoints of e, and we let $x_e$ be the
column vector that has a 1 in its u-th position, a −1 in its v-th position,
and 0s in all other positions. (This depends on which endpoint we call
u and which endpoint we call v, but we just make some choice and
stick with it. The result will be true no matter how we choose.)

Let M be the $n \times |E|$-matrix over F whose columns are the column vectors
$x_e$ for all $e \in E$ (we order them in some way; the exact order doesn’t matter).
Prove that

\[
\operatorname{rank} M = |V| - \operatorname{conn} G ,
\]

where conn G denotes the number of components of G.


[Example: Here is an example: Let G be the multigraph 


(so that n = 5). Then, if we choose the endpoints of b to be 2 and 5 in this
order, then we have $x_b = \begin{pmatrix} 0 \\ 1 \\ 0 \\ 0 \\ -1 \end{pmatrix}$. (Choosing them to be 5 and 2 instead, we
would obtain $x_b = \begin{pmatrix} 0 \\ -1 \\ 0 \\ 0 \\ 1 \end{pmatrix}$.) If we do the same for all edges of G (that is,
we choose the smaller endpoint as u and the larger endpoint as v), and if we
order the columns so that they correspond to the edges a, b, c, d, e, f, g, h from
left to right, then the matrix M comes out as follows:


1 0 1 0 0 0 0 
—1 1 0 0 1 0 0 
M=| 0 0 0 1 -1 0 1 
0 0 0 0 O 1 -1 


Oooo 
It is easy to see that rank M = 4, which is precisely |V| − conn G.]


[Remark: The claim of the exercise can be restated as follows: The span of
the vectors $x_e$ for all $e \in E$ has dimension |V| − conn G.

Topologists will recognize the matrix M as (a matrix that represents) the
boundary operator $\partial : C_1(G) \to C_0(G)$, where G is viewed as a CW-complex.]
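The claim of Exercise 3.6 can be tested experimentally before attempting a proof. The Python sketch below works over the field $\mathbb{Q}$ only (using exact fractions), assumes that the edges are given as a list of ordered pairs (u, v), and uses a made-up multigraph rather than the one from the example above. It is a numerical check, not a solution.

```python
from fractions import Fraction

def incidence_matrix(n, edges):
    """Columns x_e for edges given as pairs (u, v) with 1 <= u, v <= n; loops have u == v."""
    M = [[Fraction(0)] * len(edges) for _ in range(n)]
    for col, (u, v) in enumerate(edges):
        if u != v:                      # a loop gives the zero column
            M[u - 1][col] = Fraction(1)
            M[v - 1][col] = Fraction(-1)
    return M

def rank(M):
    """Rank over Q by Gaussian elimination on a copy of M."""
    M = [row[:] for row in M]
    r = 0
    ncols = len(M[0]) if M else 0
    for c in range(ncols):
        pivot = next((i for i in range(r, len(M)) if M[i][c] != 0), None)
        if pivot is None:
            continue
        M[r], M[pivot] = M[pivot], M[r]
        for i in range(len(M)):
            if i != r and M[i][c] != 0:
                factor = M[i][c] / M[r][c]
                M[i] = [x - factor * y for x, y in zip(M[i], M[r])]
        r += 1
    return r

def conn(n, edges):
    """Number of components of the multigraph on {1, ..., n}."""
    label = list(range(n + 1))

    def find(v):
        while label[v] != v:
            v = label[v]
        return v

    for u, v in edges:
        label[find(u)] = find(v)
    return len({find(v) for v in range(1, n + 1)})

# Hypothetical multigraph on {1, ..., 5}: a path 1-2-3 with a doubled edge,
# a loop at 4, and an edge 4-5.
n = 5
edges = [(1, 2), (2, 3), (2, 3), (4, 4), (4, 5)]
```

For this graph, `rank(incidence_matrix(n, edges))` is 3 and `conn(n, edges)` is 2, so both sides of the claimed equality equal 3.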


Exercise 3.7. If G is a multigraph, then conn G shall denote the number of
connected components of G. (Note that this is 0 when G has no vertices, and
1 if G is connected.)

Let $(V, H, \varphi)$ be a multigraph. Let E and F be two subsets of H.


(a) Prove that

\[
\operatorname{conn}(V, E, \varphi|_E) + \operatorname{conn}(V, F, \varphi|_F)
\leq \operatorname{conn}(V, E \cup F, \varphi|_{E \cup F}) + \operatorname{conn}(V, E \cap F, \varphi|_{E \cap F}) . \tag{2}
\]

[Hint: Feel free to restrict yourself to the case of a simple graph; in this
case, E and F are two subsets of $\mathcal{P}_2(V)$, and you have to show that

\[
\operatorname{conn}(V, E) + \operatorname{conn}(V, F) \leq \operatorname{conn}(V, E \cup F) + \operatorname{conn}(V, E \cap F) .
\]

This isn’t any easier than the general case, but saves you the hassle of
carrying the map $\varphi$ around.]


(b) Give an example where the inequality (2) does not become an equality.


[Solution: This is Exercise 3 on homework set #3 from my Spring 2017 


course; see the course page for solutions.] 


Exercise 3.8. Let $G = (V, E, \varphi)$ be a connected multigraph with 2m edges,
where $m \in \mathbb{N}$. A set {e, f} of two distinct edges will be called a friendly
couple if e and f have at least one endpoint in common. Prove that the
edge set of G can be decomposed into m disjoint friendly couples (i.e., there
exist m disjoint friendly couples $\{e_1, f_1\}, \{e_2, f_2\}, \ldots, \{e_m, f_m\}$ such that
$E = \{e_1, f_1, e_2, f_2, \ldots, e_m, f_m\}$). (“Disjoint” means “disjoint as sets”, i.e., having
no edges in common.)

[Example: Here is a graph with an even number of edges:


One possible decomposition of its edge set into disjoint friendly couples is
{a, y}, {b, z}, {c, x}.]

[Hint: Induct on |E|. Pick a vertex v of degree > 1 and consider the 
components of G \v.] 


Exercise 3.9. Let n > 0. Let $d_1, d_2, \ldots, d_n$ be n nonnegative integers such that
$d_1 + d_2 + \cdots + d_n$ is even.


(a) Prove that there exists a multigraph G with vertex set $\{1, 2, \ldots, n\}$ such
that all $i \in \{1, 2, \ldots, n\}$ satisfy $\deg i = d_i$.


(b) Prove that there exists a loopless multigraph G with vertex set
$\{1, 2, \ldots, n\}$ such that all $i \in \{1, 2, \ldots, n\}$ satisfy $\deg i = d_i$ if and only
if each $i \in \{1, 2, \ldots, n\}$ satisfies the inequality

\[
\sum_{\substack{j \in \{1, 2, \ldots, n\}; \\ j \neq i}} d_j \geq d_i . \tag{3}
\]

[Remark: The inequality (3) is the “n-gon inequality”: It is equivalent to
the existence of a (possibly degenerate) n-gon with sidelengths $d_1, d_2, \ldots, d_n$.]


Exercise 3.10. Let G be a loopless multigraph. Recall that a trail (in G) means 
a walk whose edges are distinct (but whose vertices are not necessarily dis- 
tinct). Let u and v be two vertices of G. As usual, “trail from u to v” means 
“trail that starts at u and ends at v”. Prove that 


\[
(\text{the number of trails from } u \text{ to } v \text{ in } G)
\equiv (\text{the number of paths from } u \text{ to } v \text{ in } G) \mod 2 .
\]


[Hint: Try to pair up the non-path trails into pairs. Make sure to prove that 
this pairing is well-defined (i.e., each non-path trail t has exactly one partner, 
which is not itself, and that t is the designated partner of its partner!).] 
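Before looking for the pairing, it can help to watch the congruence on a small example. The following brute-force Python sketch assumes a loopless multigraph given by a dictionary `phi` from edge names to two-element frozensets of endpoints. The representation and the example graph are ours, and the exhaustive search is only feasible for very small graphs.

```python
def count_trails_and_paths(phi, u, v):
    """Brute-force counts of trails and of paths from u to v (loopless multigraph)."""
    trails = paths = 0

    def extend(current, used_edges, visited, still_path):
        nonlocal trails, paths
        if current == v:
            trails += 1
            paths += still_path          # True counts as 1, False as 0
        for e, ends in phi.items():
            if e in used_edges or current not in ends:
                continue
            (nxt,) = ends - {current}    # loopless: exactly one other endpoint
            extend(nxt, used_edges | {e}, visited | {nxt},
                   still_path and nxt not in visited)

    extend(u, frozenset(), frozenset({u}), True)
    return trails, paths

# Hypothetical example: a triangle on 1, 2, 3 with the edge {2, 3} doubled.
phi = {
    "a": frozenset({1, 2}),
    "b": frozenset({2, 3}),
    "c": frozenset({1, 3}),
    "d": frozenset({2, 3}),
}
trails, paths = count_trails_and_paths(phi, 1, 3)
```

In this example there are 5 trails and 3 paths from 1 to 3. The two non-path trails are $(1, c, 3, b, 2, d, 3)$ and $(1, c, 3, d, 2, b, 3)$, which differ only in the direction in which the closed part is traversed.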
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Exercise 3.11. Let G be a multigraph such that every vertex of G has even 
degree. Let u and v be two distinct vertices of G. Prove that the number of 
paths from u to v is even. 


[Hint: When you add an edge joining u to v, the graph G becomes a graph 
with exactly two odd-degree vertices u and v, and the claim becomes “the 
number of paths from u to v is odd” (why?). In this form, the claim turns 
out to be easier to prove. Indeed, any path must start with some edge... 

Keep in mind that paths can be replaced by trails, by Exercise 3.10.]


Exercise 3.12. Let G = (V, E, φ) be a multigraph such that |E| > |V|. Prove
that G has a cycle of length ≤ (2n + 2)/3, where n = |V|.


[Solution: This is Exercise 8 on midterm #3 from my Spring 2017 course 
(except that the simple graph was replaced by a multigraph); see the course 
page for solutions.] 


3.4. Eulerian circuits and walks 
3.4.1. Definitions 


Let us now move on to a new feature of multigraphs, one that we have not yet 
studied (even for simple graphs). 

Recall that a Hamiltonian path or cycle is a path or cycle that contains all
vertices of the graph. Being a path or cycle, it has to contain each of them
exactly once (except, in the case of a cycle, for its starting point).

What about a walk or closed walk that contains all edges exactly once in- 
stead? These are called “Eulerian” walks or circuits; here is the formal defini- 
tion: 


Definition 3.4.1. Let G be a multigraph. 


(a) A walk of G is said to be Eulerian if each edge of G appears exactly 
once in this walk. 


(In other words: A walk (v_0, e_1, v_1, e_2, v_2, ..., e_k, v_k) of G is said to be
Eulerian if for each edge e of G, there exists exactly one i ∈ {1,2,...,k}
such that e = e_i.)


(b) An Eulerian circuit of G means a circuit (i.e., closed walk) of G that is 
Eulerian. (Strictly speaking, the preceding sentence is redundant, but 
we still said it to stress the notion of an Eulerian circuit.) 


Unlike a Hamiltonian path or cycle, an Eulerian walk or circuit is usually
not a path or cycle. Also, finding an Eulerian walk in a multigraph G is not
the same as finding an Eulerian walk in the simple graph G^simp. (Nevertheless,
some authors call Eulerian walks “Eulerian paths” and call Eulerian circuits
“Eulerian cycles”. This is rather confusing.)


Example 3.4.2. Consider the following multigraphs:

[Figure: drawings of the eight multigraphs A, B, C, D, E, F, G and H.]

• The multigraph A has an Eulerian walk
(3,d,5,b,2,e,3,2,4,f,5,c,1,a,2). But A has no Eulerian circuit.
The easiest way to see this is by observing that A has a vertex of odd 
degree (e.g., the vertex 2). If an Eulerian circuit were to exist, then it 
would have to enter this vertex as often as it exited it; but this would 
mean that the degree of this vertex would be even (because each edge 
containing this vertex would be used exactly once either to enter or 
to exit it, except for loops, which would be used twice). So, more 
generally, any multigraph that has a vertex of odd degree cannot have 
an Eulerian circuit. 


• The multigraph B has an Eulerian circuit (1,a,2,b,3,c,4,d,1), and thus
of course an Eulerian walk (since any Eulerian circuit is an Eulerian 
walk). 


• The multigraph C has an Eulerian circuit
(1,g,1,b,2,c,3,d,2,e,4,f,2,a,1).


• The multigraph D has no Eulerian walk. Indeed, it has four vertices of
odd degree. If v is a vertex of odd degree, then any Eulerian walk has to 
either start or end at v (since otherwise, the walk would enter and leave 
v equally often, but then the degree of v would be even). But a walk 
can only have one starting point and one ending point. This allows for 
two vertices of odd degree, but not more than two. So, more generally, 
any multigraph that has more than two vertices of odd degree cannot 
have an Eulerian walk. 


• The multigraph E has no Eulerian walk. The reason is the same as for
D. Note that E is the famous multigraph of bridges in Königsberg, as 
studied by Euler in 1736 (see the Wikipedia page for “Seven bridges of 
Königsberg” for the backstory). 


• The multigraph F has no Eulerian walk, since it has two components,
each containing at least one edge. (An Eulerian walk would have to 
contain both edges b and c, but there is no way to walk between them, 
since they belong to different components.) 


• The multigraph G has an Eulerian walk, namely
(3,b,2,h,5,g,1,a,2,f,4,d,1,e,3,c,4). It has no Eulerian circuit,
since it has two vertices of odd degree. 


• The multigraph H has an Eulerian circuit, namely (1).


Remark 3.4.3. For the pedants: A multigraph can have an Eulerian circuit
even if it is not connected, as long as all its edges belong to the same
component (i.e., all but one of its components are just singletons with no
edges). Here is an example:


Exercise 3.13. Let n be a positive integer. Recall from Definition 2.6.1 (a) that
K_n denotes the complete graph on n vertices. This is the graph with vertex
set V = {1,2,...,n} and edge set P_2(V) (so each two distinct vertices are
adjacent).

Find Eulerian circuits for the graphs K_3, K_5, and K_7.


[Solution: This is Exercise 2 on homework set #2 from my Spring 2017
course; see the course page for solutions.]


3.4.2. The Euler-Hierholzer theorem


How hard is it to find an Eulerian walk or circuit in a multigraph, or to check 
if there is any? Surprisingly, this is a lot easier than the same questions for 
Hamiltonian paths or cycles. The second question in particular is answered 
(for connected multigraphs) by the Euler-Hierholzer theorem: 


Theorem 3.4.4 (Euler, Hierholzer). Let G be a connected multigraph. Then: 


(a) The multigraph G has an Eulerian circuit if and only if each vertex of 
G has even degree. 


(b) The multigraph G has an Eulerian walk if and only if all but at most 
two vertices of G have even degree. 
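
The criterion of Theorem 3.4.4 is easy to test mechanically. The following is a minimal Python sketch, under the assumption (not from the text) that a multigraph is given as a list of vertices together with a list of endpoint pairs, one pair per edge; the function names are hypothetical. It does not check connectedness, which the theorem requires as a hypothesis.

```python
from collections import Counter


def degrees(vertices, edges):
    """Degree of each vertex; `edges` is a list of (u, v) endpoint pairs.

    A loop (v, v) contributes 2 to deg v, matching the convention in the text.
    """
    deg = Counter({v: 0 for v in vertices})
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return deg


def odd_vertices(vertices, edges):
    """List of the vertices of odd degree."""
    deg = degrees(vertices, edges)
    return [v for v in vertices if deg[v] % 2 == 1]


def has_eulerian_circuit(vertices, edges):
    # Theorem 3.4.4 (a); the multigraph is ASSUMED to be connected.
    return len(odd_vertices(vertices, edges)) == 0


def has_eulerian_walk(vertices, edges):
    # Theorem 3.4.4 (b); the multigraph is ASSUMED to be connected.
    return len(odd_vertices(vertices, edges)) <= 2


def complete_graph(n):
    """The complete graph K_n on {1, ..., n}, in the format used above."""
    vs = list(range(1, n + 1))
    return vs, [(i, j) for i in vs for j in vs if i < j]
```

For instance, every vertex of K_5 has degree 4, so the test reports an Eulerian circuit, whereas K_4 has four vertices of degree 3 and thus not even an Eulerian walk.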


We already proved the “⟹” directions of both parts (a) and (b) in Example
3.4.2. It remains to prove the “⟸” directions. I don’t think that Euler actually
proved them in his 1736 paper, but Hierholzer did in 1873. The “standard”
proof can be found in many texts, such as […, Theorem 5.2.2 and Theorem
5.2.3]. I will sketch a different proof, which I learnt from
[…, Problem 12.35]. We begin with the following definition:


Definition 3.4.5. Let G be a multigraph. A trail of G means a walk of G 
whose edges are distinct. 


So a trail can repeat vertices, but cannot repeat edges. 
Thus, an Eulerian walk has to be a trail. A trail cannot be any longer than 
an Eulerian walk. So a reasonable way to try constructing an Eulerian walk 
is to start with some trail, and make it progressively longer until it becomes
Eulerian (hopefully). 

This suggests the following approach to proving the “⟸” directions of
Theorem 3.4.4: We pick a longest trail of G and argue that (under the right
assumptions) it has to be Eulerian, since otherwise there would be a way to
make it longer. Of course, we need to find such a way. Here is the first step:


Lemma 3.4.6. Let G be a multigraph with at least one vertex. Then, G has a 
longest trail. 


Proof. Clearly, G has at least one trail (e.g., a length-0 trail from a vertex to 
itself). Moreover, G has only finitely many trails (since each edge of G can only 
be used once in a trail, and there are only finitely many edges). Hence, the 
maximum principle proves the lemma. □


Our goal now is to show that under appropriate conditions, such a longest 
trail will be Eulerian. This will require two further lemmas. 

First, one more piece of notation: We say that an edge e of a multigraph G
intersects a walk w if at least one endpoint of e is a vertex of w. Here is how
this can look:

[Figure: an edge e with an endpoint on a walk w; the edges of w are marked with a “w” underneath them.]

or

[Figure: the same, where the endpoint of e that is a vertex of w happens to be the starting point of w.]

or

[Figure: an edge e both of whose endpoints happen to be vertices of w.]

Be careful with such pictures, though: A walk doesn’t have to be a path; it
can visit a vertex any number of times!

Lemma 3.4.7. Let G be a connected multigraph. Let w be a walk of G.
Assume that there exists an edge of G that is not an edge of w.
Then, there exists an edge of G that is not an edge of w but intersects w. 


Proof. We assumed that there exists an edge of G that is not an edge of w. Pick 
such an edge, and call it f. 

A “w-f-path” will mean a path from a vertex of w to an endpoint of f. Such 
a path clearly exists, since G is connected. Thus, we can pick a shortest such 
path. If this shortest path has length 0, then we are done (since f intersects w in 
this case). If not, we consider the first edge of this path. This first edge cannot 
be an edge of w, because otherwise we could remove it from the path and get 
an even shorter w-f-path. But it clearly intersects w. So we have found an edge 
of G that is not an edge of w but intersects w. This proves the lemma. □


Lemma 3.4.8. Let G be a multigraph such that each vertex of G has even 
degree. Let w be a longest trail of G. Then, w is a closed walk. 


Proof. Assume the contrary. Let u be the starting point and v the ending point
of w. Since we assumed that w is not a closed walk, we thus have u ≠ v.

Consider the edges of w that contain v. Such edges are of two kinds: those 
by which w enters v (this means that v comes immediately after this edge in 
w), and those by which w leaves v (this means that v comes immediately before 
this edge in w). Except for the very last edge of w, each edge of the former 
kind is immediately followed by an edge of the latter kind; conversely, each 
edge of the latter kind is immediately preceded by an edge of the former kind 
(since w starts at the vertex u, which is distinct from v). Hence, the walk w has 
exactly one more edge entering v than it has edges leaving v. Thus, the number 
of edges of w that contain v (with loops counting twice) is odd. However, the 
total number of edges of G that contain v (with loops counting twice) is even 
(because it is the degree of v, but we assumed that each vertex of G has even 
degree). So these two numbers are distinct. Thus, there is at least one edge of 
G that contains v but is not an edge of w. 

Fix such an edge and call it f. Now, append f to the trail w at the end. The 
result will be a trail (since f is not an edge of w) that is longer than w. But this 
contradicts the fact that w is a longest trail. Thus, the lemma is proved. □


We can now finish the proof of the Euler-Hierholzer theorem: 


Proof of Theorem 3.4.4. (a) ⟹: We proved this back in Example 3.4.2.


⟸: Assume that each vertex of G has even degree.
By Lemma 3.4.6, we know that G has a longest trail. Fix such a longest trail,
and call it w. Then, Lemma 3.4.8 shows that w is a closed walk.


^18 Loops whose only endpoint is v count as both.

We claim that w is Eulerian. Indeed, assume the contrary. Then, there exists
an edge of G that is not an edge of w. Hence, Lemma 3.4.7 shows that there
exists an edge of G that is not an edge of w but intersects w. Fix such an edge, 
and call it f. 

Since f intersects w, there exists an endpoint v of f that is a vertex of w. 
Consider this v. Since w is a closed trail, we can WLOG assume that w starts
and ends at v (since we can otherwise achieve this by rotating^19 w). Then, we
can append the edge f to the trail w. This results in a new trail (since f is not 
an edge of w) that is longer than w. And this contradicts the fact that w is a 
longest trail of G. 

This contradiction proves that w is Eulerian. Hence, w is an Eulerian circuit
(since w is a closed walk). Thus, the “⟸” direction of Theorem 3.4.4 (a) is
proven.


(b) ⟹: Already proved in Example 3.4.2.


⟸: Assume that all but at most two vertices of G have even degree. We
must prove that G has an Eulerian walk.

If each vertex of G has even degree, then this follows from Theorem 3.4.4 (a),
since every Eulerian circuit is an Eulerian walk. Thus, we WLOG assume that
not every vertex of G has even degree. In other words, the number of vertices of
G having odd degree is positive.

The handshake lemma for multigraphs (a corollary of Proposition 3.3.2) shows that the
number of vertices of G having odd degree is even. Furthermore, this number
is at most 2 (since all but at most two vertices of G have even degree). So this 
number is even, positive and at most 2. Thus, this number is 2. In other words, 
the multigraph G has exactly two vertices having odd degree. Let u and v be 
these two vertices. 

Add a new edge e that has endpoints u and v to the multigraph G (do this
even if there already is such an edge^20). Let G′ denote the resulting
multigraph. Then, in G′, each vertex has even degree (since the newly added edge
e has increased the degrees of u and v by 1, thus turning them from odd to
even). Moreover, G′ is still connected (since G was connected, and the newly
added edge e can hardly take that away). Thus, we can apply Theorem 3.4.4
(a) to G′ instead of G. As a result, we conclude that G′ has an Eulerian circuit.
Cutting the newly added edge e out of this Eulerian circuit,^21 we obtain an
Eulerian walk of G. Hence, G has an Eulerian walk. Thus, the “⟸” direction of
Theorem 3.4.4 (b) is proven. □


^19 Rotating a closed walk (w_0, e_1, w_1, e_2, w_2, ..., e_k, w_k) means moving its first vertex and its first
edge to the end, i.e., replacing the walk by (w_1, e_2, w_2, e_3, w_3, ..., e_k, w_k, e_1, w_1). This always
results in a closed walk again. For example, if (1,a,2,b,3,c,1) is a closed walk, then we can
rotate it to obtain (2,b,3,c,1,a,2); then, rotating it one more time, we obtain (3,c,1,a,2,b,3).

Clearly, by rotating a closed walk several times, we can make it start at any of its vertices.
Moreover, if we rotate a closed trail, then we obtain a closed trail.

^20 This is a time to be grateful for the notion of a multigraph. We could not do this with simple
graphs!

^21 More precisely: We rotate this circuit until e becomes its last edge, and then we remove this
last edge to obtain a walk.


Note: If you look closely at the above proof, you will see hidden in it an
algorithm for finding Eulerian circuits and walks.^22
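
One way to read the proof as an algorithm is sketched below in Python. This is an illustration under stated assumptions, not the author's own formulation: the multigraph is assumed connected and given as a vertex list plus a dict sending each edge name to its pair of endpoints, and the names `eulerian_circuit` and `eulerian_walk` are hypothetical. A walk is stored as a list alternating vertices and edge names.

```python
def eulerian_circuit(vertices, edges):
    """Eulerian circuit of a connected multigraph with all degrees even.

    `edges` maps each edge name to its endpoint pair (u, v); loops are (v, v).
    """
    incident = {v: [] for v in vertices}
    for name, (u, v) in edges.items():
        incident[u].append(name)
        if v != u:
            incident[v].append(name)
    used = set()
    walk = [vertices[0]]  # a length-0 trail

    def other_end(name, v):
        a, b = edges[name]
        return b if a == v else a

    def extend():
        # As in the proof of Lemma 3.4.8: keep appending unused edges at the
        # end of the trail.  With all degrees even, this stops only when the
        # trail has returned to its starting point.
        while True:
            end = walk[-1]
            f = next((e for e in incident[end] if e not in used), None)
            if f is None:
                return
            used.add(f)
            walk.extend([f, other_end(f, end)])

    extend()
    while len(used) < len(edges):
        # Lemma 3.4.7: some unused edge intersects the current closed trail.
        i = next(i for i, v in enumerate(walk[0::2])
                 if any(e not in used for e in incident[v]))
        # Rotate the closed trail so that it starts and ends at that vertex.
        walk[:] = walk[2 * i:] + walk[1:2 * i + 1]
        extend()
    return walk


def eulerian_walk(vertices, edges):
    """Eulerian walk of a connected multigraph with 0 or 2 odd-degree vertices."""
    odd = [v for v in vertices
           if sum((a == v) + (b == v) for a, b in edges.values()) % 2 == 1]
    if not odd:
        return eulerian_circuit(vertices, edges)
    if len(odd) != 2:
        raise ValueError("more than two vertices of odd degree")
    extra = object()  # a fresh edge name, joining the two odd-degree vertices
    circuit = eulerian_circuit(vertices, {**edges, extra: tuple(odd)})
    j = circuit.index(extra)
    # Rotate until `extra` is the last edge, then cut it out.
    return circuit[j + 1:] + circuit[1:j]
```

The sketch favors closeness to the proof over speed: each rotation copies the whole list, so the running time is roughly quadratic in the number of edges.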


Exercise 3.14. Let G be a connected multigraph. Let m be the number of 
vertices of G that have odd degree. Prove that we can add m/2 new edges to 
G in such a way that the resulting multigraph will have an Eulerian circuit. 
(It is allowed to add an edge even if there is already an edge between the 
same two vertices.) 


[Solution: This exercise is Exercise 6 on midterm #1 from my Spring 2017
course; see the course page for solutions.]


Exercise 3.15. Let G = (V, E, φ) be a multigraph. The line graph L(G) is
defined as the simple graph (E, F), where

    F = { {e_1, e_2} ∈ P_2(E) | φ(e_1) ∩ φ(e_2) ≠ ∅ }.

(In other words, L(G) is the graph whose vertices are the edges of G, and in
which two vertices e_1 and e_2 are adjacent if and only if the edges e_1 and e_2 of
G share a common endpoint.)

[Example: Here is a multigraph G along with its line graph L (G): 


Note that L (G) does not always determine G uniquely.] 
Assume that |V| > 1. Prove the following: 


^22 You might be skeptical about this. After all, in order to apply Lemma 3.4.8, we need a longest
trail, so you might wonder how we can find a longest trail to begin with.

Fortunately, we don’t need to take Lemma 3.4.8 this literally. Our above proof of Lemma
3.4.8 can be used even if w is not a longest trail. In this case, however, instead of showing
that w is a closed walk, this proof may show us a way to make w longer. In other
words, by following this proof, we may discover a trail longer than w. In this case, we can
replace w by this longer trail, and then apply Lemma 3.4.8 again. We can repeat this over
and over again, until we do end up with a closed walk. (This will eventually happen, since
we know that a trail cannot be longer than the total number of edges of G.)

(a) If G has a Hamiltonian path, then L(G) has a Hamiltonian path.
(b) If G has an Eulerian walk, then L (G) has a Hamiltonian path. 


[Solution: This exercise is Exercise 2 on midterm #1 from my Spring 2017
course (generalized from simple graphs to multigraphs); see the course page
for solutions.]


4. Digraphs and multidigraphs 


4.1. Definitions 


We have so far seen two concepts of graphs: simple graphs and multigraphs. 

For all their differences, these two concepts have one thing in common: The 
two endpoints of an edge are equal in rights. Thus, when defining walks, each 
edge serves as a “two-way road”. Hence, such graphs are good at modelling 
symmetric relations between things. 

We shall now introduce two analogous versions of “graphs” in which the 
edges have directions. These versions are known as directed graphs (short: 
digraphs). In such directed graphs, each edge will have a specified starting 
point (its “source”) and a specified ending point (its “target”). Correspondingly, 
we will draw these edges as arrows, and we will only allow using them in one 
direction (viz., from source to target) when we walk down the graph. Here are 
the definitions in detail: 


Definition 4.1.1. A simple digraph is a pair (V, A), where V is a finite set,
and where A is a subset of V × V.


Definition 4.1.2. Let D = (V, A) be a simple digraph. 


(a) The set V is called the vertex set of D; it is denoted by V (D). 


Its elements are called the vertices (or nodes) of D. 


(b) The set A is called the arc set of D; it is denoted by A (D). 
Its elements are called the arcs (or directed edges) of D. 


When u and v are two elements of V, we will occasionally use uv as 
a shorthand for the pair (u,v). Note that this means an ordered pair 
now! 


(c) If (u,v) is an arc of D (or, more generally, a pair in V × V), then u is
called the source of this arc, and v is called the target of this arc.

(d) We draw D as follows: We represent each vertex of D by a point, and
each arc uv by an arrow that goes from the point representing u to the 
point representing v. 


(e) An arc (u,v) is called a loop (or self-loop) if u = v. (In other words, an 
arc is a loop if and only if its source is its target.) 

Example 4.1.3. For each n ∈ ℕ, we define the divisibility digraph on
{1,2,...,n} to be the simple digraph (V, A), where V = {1,2,...,n} and
A = {(i, j) ∈ V × V | i divides j}.


For example, for n = 6, this digraph looks as follows:

[Figure: the divisibility digraph on {1,2,...,6}.]    (4)


Note that simple digraphs (unlike simple graphs) are allowed to have loops
(i.e., arcs of the form (v,v)).


Definition 4.1.4. A multidigraph is a triple (V, A, ψ), where V and A are
two finite sets, and ψ : A → V × V is a map.


Definition 4.1.5. Let D = (V, A, ψ) be a multidigraph.


(a) The set V is called the vertex set of D; it is denoted by V (D). 


Its elements are called the vertices (or nodes) of D. 


(b) The set A is called the arc set of D; it is denoted by A (D). 


Its elements are called the arcs (or directed edges) of D. 


(c) If a is an arc of D, and if ψ(a) = (u,v), then the vertex u is called the
source of a, and the vertex v is called the target of a.


(d) We draw D as follows: We represent each vertex of D by a point, and
each arc a by an arrow that goes from the point representing u to the
point representing v, where (u,v) = ψ(a).

Example 4.1.6. Here is a multidigraph:

[Figure: a multidigraph with vertices 1, 2, 3, 4, 5 and arcs a, b, c, d, e, f, g, h.]    (5)

Formally speaking, this multidigraph is the triple (V, A, ψ), where V =
{1,2,3,4,5} and A = {a,b,c,d,e,f,g,h} and ψ(a) = (1,2) and ψ(b) = (2,5)
and so on.


Thus, simple digraphs and multidigraphs are analogues of simple graphs
and multigraphs, respectively, in which the edges have been replaced by arcs
(“edges endowed with a direction”). The analogy is perfect except that
simple graphs forbid loops whereas simple digraphs allow them (although
different authors have different opinions on this).


Convention 4.1.7. The word “digraph” means either “simple digraph” or 
“multidigraph”, depending on the context. 


The word “digraph” was originally a shorthand for “directed graph”, but 
by now it is a technical term that is perfectly understood by everyone in the 
subject. (It is also understood by linguists, but in a rather different way.) 


4.2. Outdegrees and indegrees 


What can we do with digraphs? Many of the things we have done with graphs 
can be modified to work with digraphs (although not all their properties will 
still hold). For example, the notion of the degree of a vertex in a graph has the 
following two counterpart notions for digraphs: 


Definition 4.2.1. Let D be a digraph with vertex set V. (This can be either a
simple digraph or a multidigraph.) Let v ∈ V be any vertex. Then:


(a) The outdegree of v denotes the number of arcs of D whose source is v.
This outdegree is denoted deg^+ v.


(b) The indegree of v denotes the number of arcs of D whose target is v.
This indegree is denoted deg^- v.

Example 4.2.2. In the divisibility digraph on {1,2,3,4,5,6} (see (4) for a
drawing), we have

    deg^+ 1 = 6,  deg^- 1 = 1,  deg^+ 2 = 3,  deg^- 2 = 2,
    deg^+ 3 = 2,  deg^- 3 = 2,  deg^+ 4 = 1,  deg^- 4 = 3,
    deg^+ 5 = 1,  deg^- 5 = 2,  deg^+ 6 = 1,  deg^- 6 = 4.
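
These values can be recomputed directly from Example 4.1.3 and Definition 4.2.1. The following is a small Python sketch; the function names are hypothetical, and a simple digraph is represented (as an assumption of this sketch) by its vertex list and its set of arcs.

```python
def divisibility_digraph(n):
    """The divisibility digraph on {1, ..., n}: arcs (i, j) with i dividing j."""
    vertices = list(range(1, n + 1))
    arcs = {(i, j) for i in vertices for j in vertices if j % i == 0}
    return vertices, arcs


def outdegree(arcs, v):
    """Number of arcs whose source is v."""
    return sum(1 for (source, _) in arcs if source == v)


def indegree(arcs, v):
    """Number of arcs whose target is v."""
    return sum(1 for (_, target) in arcs if target == v)


V, A = divisibility_digraph(6)
out_degrees = [outdegree(A, v) for v in V]
in_degrees = [indegree(A, v) for v in V]
```

Here the outdegree of i counts the multiples of i in {1,...,6}, and the indegree of i counts the divisors of i; every vertex carries a loop, since each i divides itself.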


Recall Euler’s result (Proposition 3.3.2) saying that in a graph, the sum of all 
degrees is twice the number of edges. Here is an analogue of this result for 
digraphs: 


Proposition 4.2.3 (diEuler). Let D be a digraph with vertex set V and arc set
A. Then,

    ∑_{v ∈ V} deg^+ v = ∑_{v ∈ V} deg^- v = |A|.

Proof. By the definition of an outdegree, we have

    deg^+ v = (the number of arcs of D whose source is v)

for each v ∈ V. Thus,

    ∑_{v ∈ V} deg^+ v = ∑_{v ∈ V} (the number of arcs of D whose source is v)
                      = (the number of all arcs of D)
                        (since each arc of D has exactly one source,
                        and thus is counted exactly once in the sum)
                      = |A|.

Similarly, ∑_{v ∈ V} deg^- v = |A|. □


(“diEuler” is not a real mathematician; I just gave that moniker to Proposition
4.2.3 in order to stress its analogy with Euler’s 1736 result.)
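
As a sanity check of Proposition 4.2.3, here is a short Python sketch on a made-up multidigraph (the arc names and the dict `psi` are assumptions for illustration, not taken from the text). The dict plays the role of the map ψ, sending each arc to its (source, target) pair; parallel arcs and a loop are included on purpose.

```python
# A hypothetical multidigraph on the vertices 1, 2, 3, 4.
V = [1, 2, 3, 4]
psi = {"a": (1, 2), "b": (1, 2), "c": (2, 1), "d": (3, 3), "e": (2, 3)}


def outdeg(psi, v):
    """Number of arcs with source v."""
    return sum(1 for (s, _) in psi.values() if s == v)


def indeg(psi, v):
    """Number of arcs with target v."""
    return sum(1 for (_, t) in psi.values() if t == v)


total_out = sum(outdeg(psi, v) for v in V)
total_in = sum(indeg(psi, v) for v in V)
```

Both totals equal the number of arcs, since each arc is counted exactly once at its source and exactly once at its target. Note that the loop d contributes 1 to the outdegree and 1 to the indegree of vertex 3.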


4.3. Subdigraphs 


Just as we defined subgraphs of a multigraph, we can define subdigraphs (or 
“submultidigraphs”, to be very precise) of a digraph: 


Definition 4.3.1. Let D = (V, A, ψ) be a multidigraph.


(a) A submultidigraph (or, for short, subdigraph) of D means a multidigraph
of the form E = (W, B, χ), where W ⊆ V and B ⊆ A and χ = ψ|_B. In other
words, a submultidigraph of D means a multidigraph E whose vertices are
vertices of D and whose arcs are arcs of D and whose arcs have the same
sources and targets in E as they have in D.


(b) Let S be a subset of V. The induced subdigraph of D on the set S
denotes the subdigraph

    (S, A′, ψ|_{A′})

of D, where

    A′ := {a ∈ A | both the source and the target of a belong to S}.

In other words, it denotes the subdigraph of D whose vertices are the
elements of S, and whose arcs are precisely those arcs of D whose sources
and targets both belong to S. We denote this induced subdigraph by D[S].


(c) An induced subdigraph of D means a subdigraph of D that is the
induced subdigraph of D on S for some S ⊆ V.


4.4. Conversions 
4.4.1. Multidigraphs to multigraphs 


Any multidigraph D can be turned into an (undirected) graph G by “removing 
the arrowheads” (aka “forgetting the directions of the arcs”): 


Definition 4.4.1. Let D be a multidigraph. Then, D^und will denote the
multigraph obtained from D by replacing each arc with an edge whose endpoints
are the source and the target of this arc. Formally, this is defined as follows:
If D = (V, A, ψ), then D^und = (V, A, φ), where the map φ : A → P_{1,2}(V)
sends each arc a ∈ A to the set of the entries of ψ(a) (that is, to the set
consisting of the source of a and the target of a).


For example, if D is the multidigraph from (6), then D^und is the following
multigraph:

[Figure: the multigraph D^und.]


4.4.2. Multigraphs to multidigraphs 


We have just seen how to turn any multidigraph D into a multigraph D^und by
forgetting the directions of the arcs.

Conversely, we can turn a multigraph G into a multidigraph G^bidir by
“duplicating” each edge (more precisely: turning each edge into two arcs with
opposite orientations). Here is a formal definition:


Definition 4.4.2. Let G = (V, E, φ) be a multigraph. For each edge e ∈ E,
let us choose one of the endpoints of e and call it s_e; the other endpoint will
then be called t_e. (If e is a loop, then we understand t_e to mean s_e.)

We then define G^bidir to be the multidigraph (V, E × {1,2}, ψ), where
the map ψ : E × {1,2} → V × V is defined as follows: For each edge e ∈ E,
we set

    ψ(e, 1) = (s_e, t_e)    and    ψ(e, 2) = (t_e, s_e).

We call G^bidir the bidirectionalized multidigraph of G.


Note that the map ψ depends on our choice of s_e’s (that is, it depends on
which endpoint of an edge e we choose to be s_e). This makes the definition of
G^bidir non-canonical; I don’t know if there is a good way to fix this. Fortunately,
all choices of s_e’s will lead to mutually isomorphic multidigraphs G^bidir. (The
notion of isomorphism for multidigraphs is exactly the one that you expect.)


Example 4.4.3. If

[Figure: a multigraph G.]

then

[Figure: the multidigraph G^bidir.]

(Here, for example, we have chosen s_a to be 2, so that t_a = 3 and ψ(a,1) =
(2,3) and ψ(a,2) = (3,2).) Yes, even the loops of G are duplicated in G^bidir!


The operation that assigns a multidigraph G^bidir to a multigraph G is injective
(i.e., the original graph G can be uniquely reconstructed from G^bidir). This is
in stark contrast to the operation D ↦ D^und, which destroys information (the
directions of the arcs). Note that the multigraph (G^bidir)^und is not isomorphic
to G, since each edge of G is doubled in (G^bidir)^und.
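
The two conversions can be written out concretely. The following Python sketch assumes (purely for illustration) that a multigraph is a dict `phi` sending each edge to a frozenset of one or two endpoints, that a multidigraph is a dict `psi` sending each arc to a (source, target) pair, and that vertices can be compared so that `min` serves as one arbitrary choice of s_e. The function names `und` and `bidir` are hypothetical.

```python
def und(vertices, psi):
    """D^und: forget directions; each arc a becomes an edge with the entries of psi(a)."""
    return vertices, {a: frozenset(pair) for a, pair in psi.items()}


def bidir(vertices, phi):
    """G^bidir: replace each edge e by the two arcs (e, 1) and (e, 2)."""
    psi = {}
    for e, endpoints in phi.items():
        s = min(endpoints)  # an arbitrary choice of s_e
        t = max(endpoints)  # the other endpoint; equals s_e when e is a loop
        psi[(e, 1)] = (s, t)
        psi[(e, 2)] = (t, s)
    return vertices, psi


# A hypothetical multigraph: two parallel edges a, b and a loop g.
V = [1, 2, 3]
phi = {"a": frozenset({2, 3}), "b": frozenset({2, 3}), "g": frozenset({1})}
_, psi = bidir(V, phi)
_, phi_doubled = und(V, psi)
```

In this representation, `phi_doubled` has exactly twice as many edges as `phi` between any given endpoints, which is the doubling described above; the loop g also gives rise to two arcs (g, 1) and (g, 2), both equal to (1, 1) as pairs.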


4.4.3. Simple digraphs to multidigraphs 


Next, we introduce another operation: one that turns simple digraphs into
multidigraphs. This is very similar to the operation G ↦ G^mult that turns
simple graphs into multigraphs, so we will even use the same notation for it.
Its definition is as follows:


Definition 4.4.4. Let D = (V, A) be a simple digraph. Then, the corresponding
multidigraph D^mult is defined to be the multidigraph

    (V, A, ι),

where ι : A → V × V is the map sending each a ∈ A to a itself.


Example 4.4.5. If

[Figure: a simple digraph D.]

then

[Figure: the multidigraph D^mult.]

4.4.4. Multidigraphs to simple digraphs 


There is also an operation D ↦ D^simp that turns multidigraphs into simple
digraphs:^23


Definition 4.4.6. Let D = (V, A, ψ) be a multidigraph. Then, the underlying
simple digraph D^simp of D means the simple digraph

    (V, {ψ(a) | a ∈ A}).

In other words, it is the simple digraph with vertex set V in which there is an
arc from u to v if there exists an arc from u to v in D. Thus, D^simp is obtained
from D by “collapsing” parallel arcs (i.e., arcs having the same source and
the same target) to a single arc.


Example 4.4.7. If

[Figure: a multidigraph D.]

then

[Figure: the simple digraph D^simp.]


^23 I will use a notation that I probably should have introduced before: If u and v are two vertices
of a digraph, then an “arc from u to v” means an arc with source u and target v.


An introduction to graph theory, version August 2, 2023 page 109 


Note that the arcs c and d have not been “collapsed” into one arc, since they 
do not have the same source and the same target. Likewise, the loop g has 
been preserved (unlike for undirected graphs). 
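
The collapsing operation of Definition 4.4.6 is a one-liner once a representation is fixed. The Python sketch below uses a hypothetical multidigraph (not the one from the lost figure), again stored as a dict `psi` from arcs to (source, target) pairs; the name `simp` is an assumption of this sketch.

```python
def simp(vertices, psi):
    """D^simp: the simple digraph whose arcs are the pairs psi(a) for arcs a of D."""
    return vertices, set(psi.values())


# A hypothetical multidigraph: a and b are parallel, c and d are opposite, g is a loop.
V = [1, 2, 3]
psi = {"a": (1, 2), "b": (1, 2), "c": (2, 3), "d": (3, 2), "g": (3, 3)}
_, arcs = simp(V, psi)
```

The parallel arcs a and b collapse to the single pair (1, 2), whereas the opposite arcs c and d stay apart as (2, 3) and (3, 2), and the loop g survives as (3, 3).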


4.4.5. Multidigraphs as a big tent 


A takeaway from this all is that multidigraphs are the “most general” notion of 
graphs we have introduced so far. Indeed, using the operations we have seen 
so far, we can convert every notion of graphs into a multidigraph: 


• Each simple graph becomes a multigraph via the G ↦ G^mult operation.

• Each multigraph, in turn, becomes a multidigraph via the G ↦ G^bidir
operation.

• Each simple digraph becomes a multidigraph via the D ↦ D^mult operation.


Since all three of these operations are injective (i.e., lose no information), we 
thus can encode each of our four notions of graphs as a multidigraph. Con- 
sequently, any theorem about multidigraphs can be specialized to the other 
three types of graphs. This doesn’t mean that any theorem on any other type 
of graphs can be generalized to multidigraphs, though (e.g., Mantel’s theorem 
holds only for simple graphs) — but when it can, we will try to state it at the 
most general level possible, to avoid doing the same work twice. 


4.5. Walks, paths, closed walks, cycles 
4.5.1. Definitions 


Let us now define various kinds of walks for simple digraphs and for multidi- 
graphs. 

For simple digraphs, we imitate the definitions from Sections 2.9 and 2.10 as
best as we can, making sure to require all arcs to be traversed in the correct
direction:


Definition 4.5.1. Let D be a simple digraph. Then: 


(a) A walk (in D) means a finite sequence (v_0, v_1, ..., v_k) of vertices of D
(with k ≥ 0) such that all of the pairs v_0v_1, v_1v_2, v_2v_3, ..., v_{k−1}v_k are
arcs of D. (The latter condition is vacuously true if k = 0.)


(b) If w = (v_0, v_1, ..., v_k) is a walk in D, then:

• The vertices of w are defined to be v_0, v_1, ..., v_k.

• The arcs of w are defined to be the pairs
v_0v_1, v_1v_2, v_2v_3, ..., v_{k−1}v_k.

• The nonnegative integer k is called the length of w. (This is the
number of all arcs of w, counted with multiplicity. It is 1 smaller
than the number of all vertices of w, counted with multiplicity.)

• The vertex v_0 is called the starting point of w. We say that w starts
(or begins) at v_0.

• The vertex v_k is called the ending point of w. We say that w ends
at v_k.


(c) A path (in D) means a walk (in D) whose vertices are distinct. In other
words, a path means a walk (v_0, v_1, ..., v_k) such that v_0, v_1, ..., v_k are
distinct.


(d) Let p and q be two vertices of D. A walk from p to q means a walk that
starts at p and ends at q. A path from p to q means a path that starts at
p and ends at q.


(e) A closed walk of D means a walk whose first vertex is identical with
its last vertex. In other words, it means a walk (w_0, w_1, ..., w_k) with
w_0 = w_k. Sometimes, closed walks are also known as circuits (but
many authors use this latter word for something slightly different).


(f) A cycle of D means a closed walk (w_0, w_1, ..., w_k) such that k ≥ 1 and
such that the vertices w_0, w_1, ..., w_{k−1} are distinct.


Note that we replaced the condition k ≥ 3 by k ≥ 1 in the definition of a
cycle, since simple digraphs can have loops. Fortunately, with the arcs being 
directed, we no longer have to worry about the same arc being traversed back 
and forth, so we need no extra condition to rule this out. 


Example 4.5.2. Consider the simple digraph

[Figure: the simple digraph D.]

Then, (1,2,3,4) and (1,3,4) are two walks of D, and these walks are paths. 
But (2,3,1) is not a walk (since you cannot use the arc 13 to get from 3 to 1). 
This digraph D has no cycles, and its only closed walks have length 0. 

Example 4.5.3. Consider the simple digraph

[Figure: the simple digraph D.]


Then, (1,2,3,1) and (3,4,3) and (4,4) are cycles of D. Moreover, 
(1,2,3,4,3,1) is a closed walk but not a cycle. 
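
Definition 4.5.1 translates almost verbatim into predicates on vertex sequences. The following Python sketch is an illustration only: the drawings of Examples 4.5.2 and 4.5.3 are not available here, so the arc sets `A1` and `A2` below are assumptions chosen to be consistent with the claims made in those examples, and the function names are hypothetical.

```python
def is_walk(arcs, w):
    """w = (v_0, ..., v_k) with k >= 0 and every consecutive pair an arc."""
    return len(w) >= 1 and all((w[i], w[i + 1]) in arcs for i in range(len(w) - 1))


def is_path(arcs, w):
    """A walk whose vertices are distinct."""
    return is_walk(arcs, w) and len(set(w)) == len(w)


def is_closed_walk(arcs, w):
    """A walk whose first and last vertices coincide."""
    return is_walk(arcs, w) and w[0] == w[-1]


def is_cycle(arcs, w):
    """A closed walk (w_0, ..., w_k) with k >= 1 and w_0, ..., w_{k-1} distinct."""
    return is_closed_walk(arcs, w) and len(w) >= 2 and len(set(w[:-1])) == len(w) - 1


# Assumed arc sets, consistent with the statements of Examples 4.5.2 and 4.5.3.
A1 = {(1, 2), (2, 3), (3, 4), (1, 3)}
A2 = {(1, 2), (2, 3), (3, 1), (3, 4), (4, 3), (4, 4)}
```

With these arc sets, `is_walk(A1, (2, 3, 1))` fails because (3, 1) is not an arc of the first digraph, while `is_cycle(A2, (4, 4))` succeeds because a loop counts as a cycle of length 1.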


Now let’s define the same concepts for multidigraphs, by modifying the
analogous definitions for multigraphs we saw in Definition 3.1.4:


Definition 4.5.4. Let D = (V, A, ψ) be a multidigraph. Then:

(a) A walk in D means a list of the form

    (v_0, a_1, v_1, a_2, v_2, ..., a_k, v_k)    (with k ≥ 0),

where v_0, v_1, ..., v_k are vertices of D, where a_1, a_2, ..., a_k are arcs of D,
and where each i ∈ {1,2,...,k} satisfies

    ψ(a_i) = (v_{i−1}, v_i)

(that is, each arc a_i has source v_{i−1} and target v_i). Note that we have
to record both the vertices and the arcs in our walk, since we want the
walk to “know” which arcs it traverses.

The vertices of a walk (v_0, a_1, v_1, a_2, v_2, ..., a_k, v_k) are v_0, v_1, ..., v_k; the
arcs of this walk are a_1, a_2, ..., a_k. This walk is said to start at v_0 and
end at v_k; it is also said to be a walk from v_0 to v_k. Its starting point is
v_0, and its ending point is v_k. Its length is k.


(b) A path means a walk whose vertices are distinct.


(c) A closed walk (or circuit) means a walk (v_0, a_1, v_1, a_2, v_2, ..., a_k, v_k) with
v_k = v_0.


(d) A cycle means a closed walk (v_0, a_1, v_1, a_2, v_2, ..., a_k, v_k) such that

• the vertices v_0, v_1, ..., v_{k−1} are distinct;

• we have k ≥ 1.

(This automatically implies that the arcs a_1, a_2, ..., a_k are distinct, since
each arc a_i has source v_{i−1}.)

Example 4.5.5. Consider the multidigraph


Then, (1,a,2,b,3,d,1) and (3,d,1,c,3) and (4,g,4) are three cycles of D,
whereas (3,d,1,a,2,b,3,d,1,c,3) is a circuit but not a cycle.


4.5.2. Basic properties 


Now, let us see which properties of walks, paths, closed walks and cycles re- 
main valid for digraphs. 

In Proposition we saw how two walks in a simple graph could be com- 
bined (“spliced together”) if the ending point of the first is the starting point of 
the second. In Proposition 3.3.7, we generalized this to multigraphs. The same 
holds for multidigraphs: 


Proposition 4.5.6. Let D be a multidigraph. Let u, v and w be three
vertices of D. Let $\mathbf{a} = (a_0, e_1, a_1, \ldots, e_k, a_k)$ be a walk from u to v. Let
$\mathbf{b} = (b_0, f_1, b_1, \ldots, f_\ell, b_\ell)$ be a walk from v to w. Then,

$(a_0, e_1, a_1, \ldots, e_k, a_k, f_1, b_1, f_2, b_2, \ldots, f_\ell, b_\ell)$
$= (a_0, e_1, a_1, \ldots, a_{k-1}, e_k, b_0, f_1, b_1, \ldots, f_\ell, b_\ell)$
$= (a_0, e_1, a_1, \ldots, a_{k-1}, e_k, v, f_1, b_1, \ldots, f_\ell, b_\ell)$

is a walk from u to w. This walk shall be denoted $\mathbf{a} * \mathbf{b}$.

Proof. The same (trivial) argument as for undirected graphs works here. □
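
The splicing operation is easy to mirror in code. The following Python sketch stores a walk as the alternating tuple (v_0, a_1, v_1, ..., a_k, v_k) and returns the spliced walk a * b. The name `splice`, the tuple encoding and the arc labels in the usage note are my own assumptions for illustration.

```python
def splice(a, b):
    """Return the walk a * b, where walks are tuples (v0, a1, v1, ..., ak, vk).

    The ending point of a must be the starting point of b.
    """
    if a[-1] != b[0]:
        raise ValueError("the first walk must end where the second walk starts")
    # Drop the shared vertex once: keep it at the start of b only.
    return tuple(a[:-1]) + tuple(b)


def length(walk):
    """The length of a walk is its number of arcs."""
    return (len(walk) - 1) // 2
```

For instance, splicing the hypothetical walks (1, "a", 2) and (2, "b", 3, "d", 1) gives (1, "a", 2, "b", 3, "d", 1), and the lengths add up.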


However, unlike for undirected graphs, we can no longer reverse walks or 
paths in digraphs. Thus, it often happens that there is a walk from u to v, but 
no walk from v to u. 


Reducing a walk to a path (as we did in Proposition for simple graphs 
and in Proposition for multigraphs) still works for multidigraphs: 


Proposition 4.5.7. Let D be a multidigraph. Let u and v be two vertices of D. 
Let a be a walk from u to v. Let k be the length of a. Assume that a is not a 
path. Then, there exists a walk from u to v whose length is smaller than k. 

Corollary 4.5.8 (When there is a walk, there is a path). Let D be a
multidigraph. Let u and v be two vertices of D. Assume that there is a walk from u
to v of length k for some $k \in \mathbb{N}$. Then, there is a path from u to v of length
$\leq k$.


The proofs of these facts are the same as for multigraphs. 
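
The reduction of a walk to a path can be sketched as follows: whenever a vertex appears twice, we cut out the closed sub-walk between its two occurrences, which makes the walk strictly shorter, and we repeat until no vertex is repeated. This is only a sketch under my own encoding assumptions (walks as alternating tuples, with vertices at the even positions); the arc labels in the usage note are hypothetical.

```python
def walk_to_path(walk):
    """Shorten a walk (v0, a1, v1, ..., ak, vk) to a path with the same endpoints."""
    walk = list(walk)
    while True:
        first_seen = {}
        for pos in range(0, len(walk), 2):  # vertices sit at the even positions
            v = walk[pos]
            if v in first_seen:
                # Remove the closed sub-walk between the two occurrences of v.
                walk = walk[:first_seen[v]] + walk[pos:]
                break
            first_seen[v] = pos
        else:
            # No vertex is repeated, so the walk is a path.
            return tuple(walk)
```

Applied to the walk (1, "a", 2, "b", 3, "d", 1, "c", 3), which visits the vertex 1 twice, this returns the path (1, "c", 3).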


The following proposition is an analogue of Proposition 2.10.4 for
multidigraphs:


Proposition 4.5.9. Let D be a multidigraph. Let w be a walk of D. Then, w
either is a path or contains a cycle (i.e., there exists a cycle of D whose arcs
are arcs of w).


Proof. This follows by the same argument as Proposition 2.10.4. □


Given a multidigraph D and two vertices u and v of D, we can pose the
same five algorithmic questions (Questions 1, 2, 3, 4 and 5) that we posed for
a simple graph G in Subsection 2.9.4. As with multigraphs, the same answers
that we gave back then are still valid in our new setting, as long as we replace
“neighbors of v” by “in-neighbors of v” (that is, vertices w such that D has an
arc from w to v), and as long as we keep track of the arcs in our paths or walks.


4.5.3. Exercises 


Exercise 4.1. Let D be a multidigraph with at least one vertex. Prove the 
following: 


(a) If each vertex v of D satisfies $\deg^+ v > 0$, then D has a cycle.


(b) If each vertex v of D satisfies $\deg^+ v = \deg^- v = 1$, then each vertex of
D belongs to exactly one cycle of D. Here, two cycles are considered to
be identical if one can be obtained from the other by cyclic rotation.


Exercise 4.2. Let p be a prime number. Let $(a_1, a_2, a_3, \ldots)$ be a sequence of
integers that is periodic with period p (that is, that satisfies $a_i = a_{i+p}$ for each
$i > 0$). Assume that $a_1 + a_2 + \cdots + a_p$ is not divisible by p. Prove that there
exists an $i \in \{1, 2, \ldots, p\}$ such that none of the p numbers

$a_i, \ a_i + a_{i+1}, \ a_i + a_{i+1} + a_{i+2}, \ \ldots, \ a_i + a_{i+1} + \cdots + a_{i+p-1}$

(that is, of the p sums $a_i + a_{i+1} + \cdots + a_j$ for $i \leq j < i + p$) is divisible by p.
[Remark: This would be false if p was not prime. For instance, for p = 4,
the sequence (0,2,2,2,0,2,2,2,...) would be a counterexample.]


[Hint: Use Exercise 4.1 (a). What is the digraph, and why does it have a
cycle?]

Exercise 4.3. Let $D = (V, A, \psi)$ be a multidigraph.


For two vertices u and v of D, we shall write $u \overset{*}{\to} v$ if there exists a path
from u to v.

A root of D means a vertex $u \in V$ such that each vertex $v \in V$ satisfies
$u \overset{*}{\to} v$.

A common ancestor of two vertices u and v means a vertex $w \in V$ such
that $w \overset{*}{\to} u$ and $w \overset{*}{\to} v$.

Assume that D has at least one vertex. Prove that D has a root if and only
if every two vertices in D have a common ancestor.


The following exercise is both a directed analogue and a generalization of
Mantel’s theorem (Theorem 2.4.6):


Exercise 4.4. Let D be a simple digraph with n vertices and a arcs. Assume
that D has no loops, and that we have $a > n^2/2$. Prove the following:


(a) The digraph D has a cycle of length 3. 


(b) We define an enhanced 3-cycle to be a triple (u, v, w) of distinct vertices 
of D such that all four pairs (u,v), (v, w), (w,u) and (u, w) are arcs of 
D. Then, the digraph D has an enhanced 3-cycle. 


Exercise 4.5. Let D = (V, A) be a simple digraph that has no cycles.

If $v = (v_1, v_2, \ldots, v_n)$ is a list of vertices of D (not necessarily a walk!), then
a back-cut of v shall mean an arc $a \in A$ whose source is $v_i$ and whose target
is $v_j$ for some $i, j \in \{1, 2, \ldots, n\}$ satisfying $i > j$. (Colloquially speaking, a
back-cut of v is an arc of D that leads from some vertex of v to some earlier
vertex of v.)

A list $v = (v_1, v_2, \ldots, v_n)$ of vertices of D is said to be a toposort^24 of D if
it contains each vertex of D exactly once and has no back-cuts.

Prove the following: 


(a) The digraph D has at least one toposort. 


(b) If D has only one toposort, then this toposort is a Hamiltonian path of 
D. 


Here, a Hamiltonian path in D means a walk of D that contains each 
vertex of D exactly once. 

[Example: For example, the digraph


has two toposorts: (3,2,1,4) and (3,2,4,1).] 
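
For small digraphs, the definition of a toposort can be checked by brute force. The sketch below lists the back-cuts of a given list and enumerates all toposorts by trying every ordering of the vertices. The arc set `ARCS` is a hypothetical choice that has exactly the two toposorts named in the example; it is an assumption and need not be the digraph drawn there.

```python
from itertools import permutations


def back_cuts(arcs, listing):
    """Arcs whose source appears strictly later in the listing than their target."""
    position = {v: i for i, v in enumerate(listing)}
    return [(s, t) for (s, t) in arcs if position[s] > position[t]]


def toposorts(vertices, arcs):
    """All orderings of the vertices that have no back-cuts (brute force)."""
    return [p for p in permutations(vertices) if not back_cuts(arcs, p)]


# Hypothetical acyclic digraph on the vertices 1, 2, 3, 4.
VERTICES = (1, 2, 3, 4)
ARCS = {(3, 2), (2, 1), (2, 4)}
```

This brute-force search takes n! steps and is meant only to illustrate the definition, not to be an efficient sorting procedure.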


Exercise 4.6. Let n be a positive integer. Let D be a digraph that has no cycles
of length $\leq 2$. Assume that D has at least $2^{n-1}$ vertices. Prove that D has an
induced subdigraph that has n vertices and has no cycles.


4.5.4. The adjacency matrix 


A simple way to find the number of walks from a given vertex to a given vertex 
in a multidigraph is provided by matrix algebra: 


Theorem 4.5.10. Let $D = (V, A, \psi)$ be a multidigraph, where
$V = \{1, 2, \ldots, n\}$ for some $n \in \mathbb{N}$.

If M is any matrix, and if i and j are two positive integers, then $M_{i,j}$ shall
denote the (i, j)-th entry of M (that is, the entry of M in the i-th row and the
j-th column).

Let C be the $n \times n$-matrix (with real entries) defined by

$C_{i,j}$ = (the number of all arcs $a \in A$ with source i and target j)
for all $i, j \in V$.

Let $k \in \mathbb{N}$, and let $i, j \in V$. Then, $(C^k)_{i,j}$ equals the number of all walks of
D having starting point i, ending point j and length k.


^24 This is short for “topological sorting”. I don’t know where this name comes from.


Remark 4.5.11. The matrix C in Theorem 4.5.10 is known as the adjacency
matrix of D. For example, if the multidigraph is


then its adjacency matrix is

[matrix]

and thus Theorem 4.5.10 yields (among other things) that the (1,3)-rd entry
$(C^k)_{1,3}$ of its k-th power $C^k$ equals the number of all walks of D having
starting point 1, ending point 3 and length k.

The adjacency matrix of a multidigraph D determines D up to the
identities of the arcs, and thus is often used as a convenient way to encode a
multidigraph.
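
The statement of Theorem 4.5.10 can be checked numerically on small examples. The following Python sketch builds the adjacency matrix, computes its powers with plain lists (no external libraries), and compares the entries with a direct recursive count of walks. The multidigraph `ARCS` and all function names are assumptions chosen for illustration.

```python
def adjacency_matrix(n, arcs):
    """C[i-1][j-1] = number of arcs with source i and target j (vertices are 1..n)."""
    C = [[0] * n for _ in range(n)]
    for source, target in arcs:
        C[source - 1][target - 1] += 1
    return C


def mat_mul(M, N):
    """Product of square matrices via (MN)_{i,j} = sum over w of M_{i,w} N_{w,j}."""
    n = len(M)
    return [[sum(M[i][w] * N[w][j] for w in range(n)) for j in range(n)]
            for i in range(n)]


def mat_pow(C, k):
    """k-th power of C; the 0-th power is the identity matrix."""
    n = len(C)
    result = [[int(i == j) for j in range(n)] for i in range(n)]
    for _ in range(k):
        result = mat_mul(result, C)
    return result


def count_walks(arcs, start, end, k):
    """Count the walks of length k from start to end by choosing the first arc."""
    if k == 0:
        return int(start == end)
    return sum(count_walks(arcs, t, end, k - 1) for (s, t) in arcs if s == start)


# Hypothetical multidigraph on the vertices 1, 2, 3.
# The repeated pair (1, 2) stands for two parallel arcs from 1 to 2.
ARCS = [(1, 2), (1, 2), (2, 3), (3, 1), (1, 3), (3, 3)]
C = adjacency_matrix(3, ARCS)
```

Note that `count_walks` splits off the first arc of a walk, whereas the proof below splits off the last one; both lead to the same numbers.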


Proof of Theorem 4.5.10. Forget that we fixed i, j and k. We want to prove the
following claim:


Claim 1: Let $i \in V$ and $j \in V$ and $k \in \mathbb{N}$. Then,

$(C^k)_{i,j}$ = (the number of walks from i to j that have length k).


Before we prove this claim, let us recall that C is the adjacency matrix of D.
Thus, for each $i \in V$ and $j \in V$, we have

$C_{i,j}$ = (the number of all arcs $a \in A$ with source i and target j)

(by the definition of the adjacency matrix). In other words, for each $i \in V$ and
$j \in V$, we have

$C_{i,j}$ = (the number of arcs from i to j),

where we agree that an “arc from i to j” means an arc $a \in A$ with source i and
target j.
Renaming i as w in this statement, we obtain the following: For each $w \in V$
and $j \in V$, we have

$C_{w,j}$ = (the number of arcs from w to j). (6)

Let us also recall that any two $n \times n$-matrices M and N satisfy

$(MN)_{i,j} = \sum_{w=1}^{n} M_{i,w} N_{w,j}$ (7)

for any $i \in V$ and $j \in V$. (Indeed, this is just the rule for how matrices are
multiplied.)

We can now prove Claim 1: 

[Proof of Claim 1: We shall prove Claim 1 by induction on k:

Induction base: We shall first prove Claim 1 for k = 0.

Indeed, let $i \in V$ and $j \in V$. The 0-th power of any $n \times n$-matrix is defined to
be the $n \times n$ identity matrix $I_n$; thus, $C^0 = I_n$. Hence,

$(C^0)_{i,j} = (I_n)_{i,j} = \begin{cases} 1, & \text{if } i = j; \\ 0, & \text{if } i \neq j \end{cases}$ (8)

(by the definition of the identity matrix).

On the other hand, how many walks from i to j have length 0? A walk
that has length 0 must consist of a single vertex, which is simultaneously the
starting point and the ending point of this walk. Thus, a walk from i to j that
has length 0 exists only when i = j, and in this case there is exactly one such
walk (namely, the walk (i)). Hence,

(the number of walks from i to j that have length 0) $= \begin{cases} 1, & \text{if } i = j; \\ 0, & \text{if } i \neq j. \end{cases}$

Comparing this with (8), we conclude that

$(C^0)_{i,j}$ = (the number of walks from i to j that have length 0). (9)

Now, forget that we fixed i and j. We thus have proven (9) for any $i \in V$ and
$j \in V$. In other words, Claim 1 holds for k = 0. Thus, the induction base is
complete.

Induction step: Let g be a positive integer. Assume that Claim 1 holds for
k = g − 1. We must show that Claim 1 holds for k = g as well.

We have assumed that Claim 1 holds for k = g − 1. In other words, for any
$i \in V$ and $j \in V$, we have

$(C^{g-1})_{i,j}$ = (the number of walks from i to j that have length g − 1).

Renaming j as w in this statement, we obtain the following: For any $i \in V$ and
$w \in V$, we have

$(C^{g-1})_{i,w}$ = (the number of walks from i to w that have length g − 1). (10)

Each walk from i to j that has length g has the form

$\mathbf{w} = (v_0, a_1, v_1, a_2, v_2, \ldots, a_{g-1}, v_{g-1}, a_g, v_g)$

for some vertices $v_0, v_1, \ldots, v_g$ of D and some arcs $a_1, a_2, \ldots, a_g$ of D satisfying
$v_0 = i$ and $v_g = j$ and ($\psi(a_h) = (v_{h-1}, v_h)$ for all $h \in \{1, 2, \ldots, g\}$). Thus, each
such walk $\mathbf{w}$ can be constructed by the following algorithm:


• First, we choose a vertex w of D to serve as the vertex $v_{g-1}$ (that is, as the
penultimate vertex of the walk $\mathbf{w}$). This vertex w must belong to V.


• Now, we choose the vertices $v_0, v_1, \ldots, v_{g-1}$ (that is, all vertices of our
walk except for the last one) and the arcs $a_1, a_2, \ldots, a_{g-1}$ (that is, all arcs
of our walk except for the last one) in such a way that $v_{g-1} = w$. This is
tantamount to choosing a walk $(v_0, a_1, v_1, a_2, v_2, \ldots, a_{g-1}, v_{g-1})$ from i to
w that has length g − 1. This choice can be made in $(C^{g-1})_{i,w}$ many ways
(because (10) shows that the number of walks from i to w that have length
g − 1 is $(C^{g-1})_{i,w}$).

• We have now determined all but the last vertex and all but the last arc of
our walk $\mathbf{w}$. We set the last vertex $v_g$ of our walk to be j. (This is the only
possible option, since our walk $\mathbf{w}$ has to be a walk from i to j.)


• We choose the last arc $a_g$ of our walk $\mathbf{w}$. This arc $a_g$ must have source $v_{g-1}$
and target $v_g$; in other words, it must have source w and target j (since
$v_{g-1} = w$ and $v_g = j$). In other words, it must be an arc from w to j. Thus,
it can be chosen in $C_{w,j}$ many ways (because (6) shows that the number of
arcs from w to j is $C_{w,j}$).


Conversely, of course, this algorithm always constructs a walk from i to j
that has length g, and different choices in the algorithm lead to distinct walks.
Thus, the total number of walks from i to j that have length g equals the total
number of choices in the algorithm. But the latter number is
$\sum_{w \in V} (C^{g-1})_{i,w} C_{w,j}$
(since the algorithm first chooses a $w \in V$, then involves a step with $(C^{g-1})_{i,w}$
choices, and then involves a step with $C_{w,j}$ choices). Hence, the total number of
walks from i to j that have length g is $\sum_{w \in V} (C^{g-1})_{i,w} C_{w,j}$. In other words,

(the number of walks from i to j that have length g) $= \sum_{w \in V} (C^{g-1})_{i,w} C_{w,j}$.

Comparing this with

$(C^g)_{i,j} = (C^{g-1} C)_{i,j} = \sum_{w=1}^{n} (C^{g-1})_{i,w} C_{w,j}$ (by (7), applied to $M = C^{g-1}$ and $N = C$)

$= \sum_{w \in V} (C^{g-1})_{i,w} C_{w,j}$ (since $\{1, 2, \ldots, n\} = V$),

we obtain

$(C^g)_{i,j}$ = (the number of walks from i to j that have length g). (11)

Now, forget that we fixed i and j. We thus have proven (11) for any $i \in V$
and $j \in V$. In other words, Claim 1 holds for k = g. Thus, the induction step is
complete. Hence, Claim 1 is proven by induction.]

Theorem 4.5.10 follows immediately from Claim 1. □


Exercise 4.7. Let E be the following multidigraph:


E = [figure]


Let $n \in \mathbb{N}$. Compute the number of walks from 1 to 1 having length n.


4.6. Connectedness strong and weak 


We defined the “path-connected” relation for undirected graphs using the
existence of paths (see Definition 2.9.8). For a digraph, however, the relations
“there is a walk from u to v” and “there is a walk from v to u” are (in general)
distinct and non-symmetric, so I prefer not to give them a symmetric-looking
symbol such as $\simeq_D$. Instead, we define strong path-connectedness to mean the
existence of both walks:


Definition 4.6.1. Let D be a multidigraph. We define a binary relation $\simeq_D$ on
the set V(D) as follows: For two vertices u and v of D, we shall have $u \simeq_D v$
if and only if there exists a walk from u to v in D and there exists a walk
from v to u in D.

This binary relation $\simeq_D$ is called “strong path-connectedness”. When two
vertices u and v satisfy $u \simeq_D v$, we say that “u and v are strongly
path-connected”.

Example 4.6.2. Let D be as in Example Then, $1 \simeq_D 2$, because there
exists a walk from 1 to 2 in D (for instance, (1,a,2)) and there also exists a
walk from 2 to 1 in D (for instance, (2,b,3,d,1)). However, we don’t have
$3 \simeq_D 4$. Indeed, while there exists a walk from 3 to 4 in D, there exists no
walk from 4 to 3 in D.


Proposition 4.6.3. Let D be a multidigraph. Then, the relation $\simeq_D$ is an
equivalence relation.


Proof. Easy, like for simple graphs. □

Again, we can replace “walk” by “path” in the definition of the relation $\simeq_D$:


Proposition 4.6.4. Let D be a multidigraph. Let u and v be two vertices of D.
Then, $u \simeq_D v$ if and only if there exist a path from u to v and a path from v
to u.


Proof. Easy, like for simple graphs. □


Definition 4.6.5. Let D be a multidigraph. The equivalence classes of the
equivalence relation $\simeq_D$ are called the strong components of D.


Definition 4.6.6. Let D be a multidigraph. We say that D is strongly con- 
nected if D has exactly one strong component. 


Thus, a multidigraph D is strongly connected if and only if it has at least one
vertex and there is a path from any vertex to any vertex.

In comparison, here is a weaker notion of connected components and con- 
nectedness: 


Definition 4.6.7. Let D be a multidigraph. Consider its underlying
undirected multigraph $D^{\mathrm{und}}$. The components of this undirected multigraph $D^{\mathrm{und}}$
(that is, the equivalence classes of the equivalence relation $\simeq_{D^{\mathrm{und}}}$) are called
the weak components of D. We say that D is weakly connected if D has
exactly one weak component (i.e., if $D^{\mathrm{und}}$ is connected).


^25 Some authors use the word “diconnected” for “strongly connected”. As this word is just a
single letter away from “disconnected”, I cannot recommend it.


Example 4.6.8. Let D be the following simple digraph:


We treat D as a multidigraph (namely, $D^{\mathrm{mult}}$).

The weak components of D are {1,2,3,4,5} and {6,7}.

The strong components of D are {1}, {2}, {3,4,5}, {6} and {7}. (Indeed,
for example, we have $1 \not\simeq_D 2 \not\simeq_D 3$ but $3 \simeq_D 4 \simeq_D 5$.)

So D is neither strongly nor weakly connected, but has more strong than
weak components.
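
Strong and weak components of a small digraph can be computed directly from the definitions. The sketch below finds, for each vertex, the set of vertices reachable by a walk, and then groups vertices that can reach each other; for the weak components it does the same after adding the reverse of every arc. The arc set `ARCS` is a hypothetical digraph chosen so that its components agree with those listed in Example 4.6.8; it is an assumption, not necessarily the digraph of that example.

```python
def reachable(arcs, u):
    """Set of all vertices v such that there is a walk from u to v."""
    seen = {u}
    stack = [u]
    while stack:
        x = stack.pop()
        for (s, t) in arcs:
            if s == x and t not in seen:
                seen.add(t)
                stack.append(t)
    return seen


def strong_components(vertices, arcs):
    """Equivalence classes of 'u reaches v and v reaches u'."""
    reach = {u: reachable(arcs, u) for u in vertices}
    return {frozenset(v for v in vertices if v in reach[u] and u in reach[v])
            for u in vertices}


def weak_components(vertices, arcs):
    """Components of the underlying undirected graph: forget the arc directions."""
    both_ways = set(arcs) | {(t, s) for (s, t) in arcs}
    return strong_components(vertices, both_ways)


# Hypothetical digraph with the same components as in Example 4.6.8.
VERTICES = range(1, 8)
ARCS = {(1, 2), (2, 3), (3, 4), (4, 5), (5, 3), (6, 7)}
```

This direct approach recomputes reachability from every vertex, which is fine for small examples; faster algorithms exist but are not needed to illustrate the definitions.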


Example 4.6.9. The digraph from Example is weakly connected, but not 
at all strongly connected (indeed, each of its strong components has size 1). 
The digraph from Example on the other hand, is strongly connected. 


Proposition 4.6.10. Any strongly connected digraph is weakly connected.


Proof. Let D be a multidigraph. Then, any walk of D is (or, more precisely,
gives rise to) a walk of $D^{\mathrm{und}}$. Hence, if two vertices u and v of D are strongly
path-connected in D, then they are path-connected in $D^{\mathrm{und}}$. Therefore, if D is
strongly connected, then $D^{\mathrm{und}}$ is connected, but this means that D is weakly
connected. □


Exercise 4.8. Let D be a multidigraph. Prove that the strong components of 
D are the weak components of D if and only if each arc of D is contained in 
at least one cycle. 


Let us take a look at what bidirectionalization (i.e., the operation $G \mapsto G^{\mathrm{bidir}}$
that sends a multigraph G to the multidigraph $G^{\mathrm{bidir}}$) does to walks, paths,
closed walks and cycles:


Proposition 4.6.11. Let G be a multigraph. Then:


(a) The walks of G are “more or less the same as” the walks of the
multidigraph $G^{\mathrm{bidir}}$. More precisely, each walk of G gives rise to a walk of
$G^{\mathrm{bidir}}$ (with the same starting point and the same ending point), and
conversely, each walk of $G^{\mathrm{bidir}}$ gives rise to a walk of G. If G has no
loops, then this is a one-to-one correspondence (i.e., a bijection)
between the walks of G and the walks of $G^{\mathrm{bidir}}$.


(b) The paths of G are “more or less the same as” the paths of the
multidigraph $G^{\mathrm{bidir}}$. This is always a one-to-one correspondence, since paths
cannot contain loops.


(c) The closed walks of G are “more or less the same as” the closed walks
of the multidigraph $G^{\mathrm{bidir}}$.


(d) The cycles of G are not quite the same as the cycles of $G^{\mathrm{bidir}}$. In fact, if e
is an edge of G with two distinct endpoints u and v, then (u, e, v, e, u) is
not a cycle of G, but either (u, (e,1), v, (e,2), u) or (u, (e,2), v, (e,1), u)
is a cycle of $G^{\mathrm{bidir}}$ (this is best seen on a picture: G has the edge
[figure: an edge e joining u and v], whereas $G^{\mathrm{bidir}}$ has the arc-pair
[figure: the arcs (e,1) and (e,2) between u and v]), so $G^{\mathrm{bidir}}$
usually has more cycles than G has. But it is true that each cycle of G
gives rise to a cycle of $G^{\mathrm{bidir}}$.


Exercise 4.9. Let $D = (V, E, \psi)$ be a multidigraph.

Let A, B and C be three subsets of V such that the induced subdigraphs
D[A], D[B] and D[C] are strongly connected.

A cycle of D will be called eclectic if it contains at least one arc of D[A], at
least one arc of D[B] and at least one arc of D[C] (although these three arcs
are not required to be distinct).

Prove the following:


(a) If the sets $B \cap C$, $C \cap A$ and $A \cap B$ are nonempty, but $A \cap B \cap C$ is empty,
then D has an eclectic cycle.


(b) If the induced subdigraphs $D[B \cap C]$, $D[C \cap A]$ and $D[A \cap B]$ are
strongly connected, but the induced subdigraph $D[A \cap B \cap C]$ is not
strongly connected, then D has an eclectic cycle.


[Note: Keep in mind that the multidigraph with 0 vertices does not count
as strongly connected.]


[Solution: This is a generalization of Exercise 7 on midterm #2 from my
Spring 2017 course; see the course page for solutions.]


4.7. Eulerian walks and circuits 


We have studied Eulerian walks and circuits for (undirected) multigraphs in
Section 3.4. Let us now define analogous concepts for multidigraphs:


Definition 4.7.1. Let D be a multidigraph.

(a) A walk of D is said to be Eulerian if each arc of D appears exactly once
in this walk.

(In other words: A walk $(v_0, a_1, v_1, a_2, v_2, \ldots, a_k, v_k)$ of D is said to be
Eulerian if for each arc a of D, there exists exactly one $i \in \{1, 2, \ldots, k\}$
such that $a = a_i$.)

(b) An Eulerian circuit of D means a circuit (i.e., closed walk) of D that is
Eulerian.


The Euler—Hierholzer theorem gives a necessary and sufficient criterion for a 
multigraph to have an Eulerian circuit or walk. For multidigraphs, there is an 
analogous result: 


Theorem 4.7.2 (diEuler, diHierholzer). Let D be a weakly connected
multidigraph. Then:


(a) The multidigraph D has an Eulerian circuit if and only if each vertex v
of D satisfies $\deg^+ v = \deg^- v$.


(b) The multidigraph D has an Eulerian walk if and only if all but two
vertices v of D satisfy $\deg^+ v = \deg^- v$, and the remaining two vertices
v satisfy $|\deg^+ v - \deg^- v| \leq 1$.


Exercise 4.10. Prove Theorem 4.7.2.


Incidentally, the “each vertex v of D satisfies $\deg^+ v = \deg^- v$” condition has
a name:


Definition 4.7.3. A multidigraph D is said to be balanced if each vertex v of
D satisfies $\deg^+ v = \deg^- v$.


So balancedness is necessary and sufficient for the existence of an Eulerian
circuit in a weakly connected multidigraph.
The following proposition is obvious:


Proposition 4.7.4. Let G be a multigraph. Then, the multidigraph $G^{\mathrm{bidir}}$ is
balanced.


Proof. The definition of $G^{\mathrm{bidir}}$ yields that each vertex v of $G^{\mathrm{bidir}}$ satisfies
$\deg^+ v = \deg v$ and $\deg^- v = \deg v$, where $\deg v$ denotes the degree of v as a vertex of
G. Hence, each vertex v of $G^{\mathrm{bidir}}$ satisfies $\deg^+ v = \deg v = \deg^- v$. In other
words, $G^{\mathrm{bidir}}$ is balanced. □


Combining this proposition with Theorem (a), we can obtain a curious 
fact about undirected(!) multigraphs: 


Theorem 4.7.5. Let G be a connected multigraph. Then, the multidigraph
$G^{\mathrm{bidir}}$ has an Eulerian circuit. In other words, there is a circuit of G that
contains each edge exactly twice, and uses it once in each direction.

Proof. The multidigraph $G^{\mathrm{bidir}}$ is balanced (by Proposition 4.7.4) and weakly
connected (this follows easily from the connectedness of G). Hence, Theorem
(a) can be applied to $D = G^{\mathrm{bidir}}$. Thus, $G^{\mathrm{bidir}}$ has an Eulerian circuit.
Reinterpreting this circuit as a circuit of G, we obtain a circuit of G that
contains each edge exactly twice, and uses it once in each direction. This proves
Theorem 4.7.5. □
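
Theorem 4.7.5 can be tried out on a small example. The Python sketch below builds the bidirectionalization of a multigraph (each edge e becomes the two arcs (e,1) and (e,2)), checks balancedness, and then finds an Eulerian circuit with the classical stack-based form of Hierholzer's algorithm. This is a sketch under assumptions: the encoding of graphs as dictionaries, the function names and the multigraph `EDGES` are mine, and the circuit finder presupposes that its input is balanced and weakly connected.

```python
from collections import Counter


def bidir(edges):
    """Arcs of the bidirectionalization: edge e = {u, v} gives (e,1): u->v and (e,2): v->u."""
    arcs = {}
    for e, (u, v) in edges.items():
        arcs[(e, 1)] = (u, v)
        arcs[(e, 2)] = (v, u)
    return arcs


def is_balanced(arcs):
    """Check that each vertex has equal outdegree and indegree."""
    outdeg = Counter(s for (s, t) in arcs.values())
    indeg = Counter(t for (s, t) in arcs.values())
    return outdeg == indeg


def eulerian_circuit(arcs, start):
    """Hierholzer's algorithm; assumes arcs is balanced and weakly connected."""
    out = {}
    for name, (s, t) in arcs.items():
        out.setdefault(s, []).append((name, t))
    stack = [(start, None)]  # pairs (vertex, arc used to enter it)
    finished = []
    while stack:
        v, _ = stack[-1]
        if out.get(v):
            name, t = out[v].pop()  # follow an unused arc out of v
            stack.append((t, name))
        else:
            finished.append(stack.pop())  # v is stuck: it goes into the circuit
    finished.reverse()
    walk = [finished[0][0]]
    for v, name in finished[1:]:
        walk += [name, v]
    return walk  # alternating list (v0, a1, v1, ..., ak, vk)


# Hypothetical connected multigraph with a pair of parallel edges e and h.
EDGES = {"e": (1, 2), "f": (2, 3), "g": (3, 1), "h": (1, 2)}
ARCS = bidir(EDGES)
CIRCUIT = eulerian_circuit(ARCS, 1)
```

Reading off the first components of the arcs in `CIRCUIT` gives a circuit of the multigraph that uses each edge exactly twice, once in each direction.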


4.8. Hamiltonian cycles and paths 


We can define Hamiltonian paths and cycles for simple digraphs in the same 
way as we defined them for simple graphs: 


Definition 4.8.1. Let D = (V, A) be a simple digraph. 


(a) A Hamiltonian path in D means a walk of D that contains each vertex 
of D exactly once. Obviously, it is a path. 


(b) A Hamiltonian cycle in D means a cycle $(v_0, v_1, \ldots, v_k)$ of D such that
each vertex of D appears exactly once among $v_0, v_1, \ldots, v_{k-1}$.


Convention 4.8.2. In the following, we will abbreviate: 


e “Hamiltonian path” as “hamp”; 


e “Hamiltonian cycle” as “hamc”. 


We might wonder what can be said about hamps and hamcs for digraphs. Is 
there an analogue of Ore’s theorem? The answer is “yes”, but it is significantly 
harder to prove: 


Theorem 4.8.3 (Meyniel). Let D = (V, A) be a strongly connected loopless
simple digraph with n vertices. Assume that for each pair $(u, v) \in V \times V$ of
two vertices u and v satisfying $u \neq v$ and $(u, v) \notin A$ and $(v, u) \notin A$, we have
$\deg u + \deg v \geq 2n - 1$. Here, $\deg w$ means $\deg^+ w + \deg^- w$. Then, D has a
hamc.


For the (rather complicated) proof of this, see [BonTho77] or [Berge91, §10.3,
Theorem 7]. Note that the “strongly connected” condition is needed.


4.9. The reverse and complement digraphs 


We take a break from studying hamps (Hamiltonian paths) in order to intro- 
duce two more operations on simple digraphs. 

Definition 4.9.1. Let D = (V, A) be a simple digraph. Then:

(a) The elements of $(V \times V) \setminus A$ will be called the non-arcs of D.

(b) The reversal of a pair $(i, j) \in V \times V$ means the pair (j, i).

(c) We define $D^{\mathrm{rev}}$ as the simple digraph $(V, A^{\mathrm{rev}})$, where

$A^{\mathrm{rev}} = \{ (j, i) \mid (i, j) \in A \}$.

Thus, $D^{\mathrm{rev}}$ is the digraph obtained from D by reversing each arc (i.e.,
swapping its source and its target). This is called the reversal of D.


(d) We define $\overline{D}$ as the simple digraph $(V, (V \times V) \setminus A)$. This is the
digraph that has the same vertices as D, but whose arcs are precisely the
non-arcs of D. This digraph $\overline{D}$ is called the complement of D.


Example 4.9.2. Let D = [figure]. Then, $D^{\mathrm{rev}}$ = [figure] and
$\overline{D}$ = [figure].


Convention 4.9.3. In the following, the symbol # means “number”. For
example,

(# of subsets of {1,2,3}) = 8.


We now shall try to count hamps in simple digraphs.^26 As a warmup, here
is a particularly simple case:


^26 See [17s-lec7] for a more detailed treatment of this topic.

Proposition 4.9.4. Let D be the simple digraph (V, A), where

$V = \{1, 2, \ldots, n\}$ for some $n \in \mathbb{N}$,

and where

$A = \{ (i, j) \mid i < j \}$.

Then, (# of hamps of D) = 1.


Proof. It is easy to see that the only hamp of D is (1, 2, ..., n). □


The following is easy, too: 
Proposition 4.9.5. Let D be a simple digraph. Then,

(# of hamps of $D^{\mathrm{rev}}$) = (# of hamps of D).


Proof. The hamps of $D^{\mathrm{rev}}$ are obtained from the hamps of D by walking
backwards. □


So far, so boring. What about this:

Theorem 4.9.6 (Berge’s theorem). Let D be a simple digraph. Then,

(# of hamps of $\overline{D}$) $\equiv$ (# of hamps of D) mod 2.


This is much less obvious or even expected. We first give an example:


Example 4.9.7. Let D be the following digraph:


This digraph has 3 hamps: (1,2,3) and (2,3,1) and (3,1,2).
Its complement $\overline{D}$ looks as follows:


It has only 1 hamp: (1,3,2).
Thus, in this case, Theorem says that $1 \equiv 3 \bmod 2$.
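
Counts like these are easy to verify by brute force. The sketch below enumerates the hamps of a simple digraph by trying all orderings of the vertices, and builds the complement as in Definition 4.9.1 (d). The arc set `A` is an assumption: it is one arc set on the vertices 1, 2, 3 that reproduces the hamps stated in Example 4.9.7, inferred from the text rather than read off the picture.

```python
from itertools import permutations, product


def hamps(vertices, arcs):
    """All Hamiltonian paths: orderings in which consecutive vertices form arcs."""
    return [p for p in permutations(vertices)
            if all((p[i], p[i + 1]) in arcs for i in range(len(p) - 1))]


def complement(vertices, arcs):
    """Arc set of the complement digraph: all pairs (including loops) not in arcs."""
    return set(product(vertices, repeat=2)) - set(arcs)


# Hypothetical arc set consistent with the hamps listed in Example 4.9.7.
V = (1, 2, 3)
A = {(1, 2), (2, 1), (2, 3), (3, 1)}
```

Here `hamps(V, A)` has three elements and `hamps(V, complement(V, A))` has one, so the two counts agree modulo 2 as the theorem predicts.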

Proof of Theorem (This is an outline; see proof of Theorem 1.3.6]
for more details.)

Write the simple digraph D as D = (V, A), and assume WLOG that $V \neq \varnothing$.
Set n = |V|.

A V-listing will mean a list of elements of V that contains each element of
V exactly once. (Thus, each V-listing is an n-tuple, and there are n! many
V-listings.) Note that a V-listing is the same as a hamp of the “complete” digraph
$(V, V \times V)$. Any hamp of D or of $\overline{D}$ is therefore a V-listing, but not every
V-listing is a hamp of D or $\overline{D}$.

If $\sigma = (\sigma_1, \sigma_2, \ldots, \sigma_n)$ is a V-listing, then we define a set

$P(\sigma) := \{ \sigma_1\sigma_2, \ \sigma_2\sigma_3, \ \ldots, \ \sigma_{n-1}\sigma_n \}$.

We call this set $P(\sigma)$ the arc set of $\sigma$. When we regard $\sigma$ as a hamp of
$(V, V \times V)$, this set $P(\sigma)$ is just the set of all arcs of $\sigma$. Note that this is an
(n − 1)-element set. We make a few easy observations (prove them!):


Observation 1: We can reconstruct a V-listing $\sigma$ from its arc set $P(\sigma)$.
In other words, the map $\sigma \mapsto P(\sigma)$ is injective.


Observation 2: Let $\sigma$ be a V-listing. Then, $\sigma$ is a hamp of D if and
only if $P(\sigma) \subseteq A$.


Observation 3: Let $\sigma$ be a V-listing. Then, $\sigma$ is a hamp of $\overline{D}$ if and
only if $P(\sigma) \subseteq (V \times V) \setminus A$.


Now, let N be the # of pairs $(\sigma, B)$, where $\sigma$ is a V-listing and B is a subset of
A satisfying $B \subseteq P(\sigma)$. Thus,

$N = \sum_{\sigma \text{ is a } V\text{-listing}} N_\sigma$,

where

$N_\sigma$ = (# of subsets B of A satisfying $B \subseteq P(\sigma)$).

But we also have

$N = \sum_{B \text{ is a subset of } A} N^B$,

where

$N^B$ = (# of V-listings $\sigma$ satisfying $B \subseteq P(\sigma)$).

Let us now relate these two sums to hamps. We begin with
$\sum_{\sigma \text{ is a } V\text{-listing}} N_\sigma$.

We shall use the Iverson bracket notation: i.e., the notation $[\mathcal{A}]$ for the truth
value of a statement $\mathcal{A}$. This truth value is defined to be the number 1 if $\mathcal{A}$ is
true, and 0 if $\mathcal{A}$ is false. For instance,

[2 + 2 = 4] = 1 and [2 + 2 = 5] = 0.
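
For any V-listing $\sigma$, we have

$N_\sigma$ = (# of subsets B of A satisfying $B \subseteq P(\sigma)$)
$=$ (# of subsets B of $A \cap P(\sigma)$)
$= 2^{|A \cap P(\sigma)|}$
$\equiv [\,|A \cap P(\sigma)| = 0\,]$ (since $2^m \equiv [m = 0] \bmod 2$ for each $m \in \mathbb{N}$)
$= [A \cap P(\sigma) = \varnothing]$ (since equivalent statements have the same truth value)
$= [P(\sigma) \subseteq (V \times V) \setminus A]$ (since $P(\sigma)$ is always a subset of $V \times V$)
$= [\sigma \text{ is a hamp of } \overline{D}] \bmod 2$ (by Observation 3).

So

$N = \sum_{\sigma \text{ is a } V\text{-listing}} N_\sigma \equiv \sum_{\sigma \text{ is a } V\text{-listing}} [\sigma \text{ is a hamp of } \overline{D}]$
(since each $N_\sigma$ satisfies $N_\sigma \equiv [\sigma \text{ is a hamp of } \overline{D}] \bmod 2$)
$=$ (# of V-listings $\sigma$ that are hamps of $\overline{D}$)
(because $\sum_{\sigma \text{ is a } V\text{-listing}} [\sigma \text{ is a hamp of } \overline{D}]$ is a sum
of several 1’s and several 0’s, and the 1’s in this sum correspond precisely to
the V-listings $\sigma$ that are hamps of $\overline{D}$)
$=$ (# of hamps of $\overline{D}$) mod 2.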


What about the other expression for N? Recall that

$N = \sum_{B \text{ is a subset of } A} N^B$,

where

$N^B$ = (# of V-listings $\sigma$ satisfying $B \subseteq P(\sigma)$).

We want to prove that this sum equals (# of hamps of D), at least modulo 2.

So let B be a subset of A. We want to know $N^B \bmod 2$. In other words, we
want to know when $N^B$ is odd.

Let us first assume that $N^B$ is odd, and see what follows from this.

Since $N^B$ is odd, we have $N^B > 0$. Thus, there exists at least one V-listing $\sigma$
satisfying $B \subseteq P(\sigma)$. We shall now draw some conclusions from this.

First, a definition: A path cover of V means a set of paths in the “complete”
digraph $(V, V \times V)$ such that each vertex $v \in V$ is contained in exactly one of
these paths. The set of arcs of such a path cover is simply the set of all arcs of
all its paths. For example, if V = {1,2,3,4,5,6,7}, then


{(1,3,5), (2), (6), (7,4)}
is a path cover of V, and its set of arcs is {13, 35, 74}.

Now, ponder the following: If we remove an arc $v_i v_{i+1}$ from a path $(v_1, v_2, \ldots, v_k)$,
then this path breaks up into two paths $(v_1, v_2, \ldots, v_i)$ and $(v_{i+1}, v_{i+2}, \ldots, v_k)$.
Thus, if we remove some arcs from the arc set $P(\sigma)$ of a V-listing $\sigma$, then we
obtain the set of arcs of a path cover of V. (For instance, removing the arcs
52, 26 and 67 from the arc set $P(\sigma)$ of the V-listing $\sigma$ = (1,3,5,2,6,7,4) yields
precisely the path cover {(1,3,5), (2), (6), (7,4)} that we just showed as an
example.)

Now, recall that there exists at least one V-listing $\sigma$ satisfying $B \subseteq P(\sigma)$.
Hence, B is obtained by removing some arcs from the arc set $P(\sigma)$ of this
V-listing $\sigma$. Therefore, B is the set of arcs of a path cover of V (by the claim of
the preceding paragraph). Let us say that this path cover consists of exactly r
paths. Then,

(# of V-listings $\sigma$ satisfying $B \subseteq P(\sigma)$) = r!,

because any such V-listing $\sigma$ can be constructed by concatenating the r paths
in our path cover in some order (and there are r! possible orders).

Thus, $N^B$ = (# of V-listings $\sigma$ satisfying $B \subseteq P(\sigma)$) = r!. But we have
assumed that $N^B$ is odd. So r! is odd. Since r is positive (because $V \neq \varnothing$, so our
path cover must contain at least one path), this entails that r = 1. So our path
cover is just a single path; this path is a path of D (since its set of arcs B is a
subset of A) and therefore is a hamp of D (since it constitutes a path cover of V
all by itself). If we denote it by $\sigma$, then we have $B = P(\sigma)$ (since B is the set of
arcs of the path cover that consists of $\sigma$ alone).

Forget our assumption that $N^B$ is odd. We have thus shown that if $N^B$ is odd,
then $B = P(\sigma)$ for some hamp $\sigma$ of D.

Conversely, it is easy to see that if $B = P(\sigma)$ for some hamp $\sigma$ of D, then $N^B$
is odd (and actually equals 1).

Combining these two results, we see that $N^B$ is odd if and only if $B = P(\sigma)$
for some hamp $\sigma$ of D. Therefore,

$[N^B \text{ is odd}] = [B = P(\sigma) \text{ for some hamp } \sigma \text{ of } D]$.

However,

$N^B \equiv [N^B \text{ is odd}]$ (since $m \equiv [m \text{ is odd}] \bmod 2$ for any $m \in \mathbb{Z}$)
$= [B = P(\sigma) \text{ for some hamp } \sigma \text{ of } D] \bmod 2$.


An introduction to graph theory, version August 2, 2023 page 130 


We have proved this congruence for every subset B of A. Thus,

$N = \sum_{B \text{ is a subset of } A} N^B \equiv \sum_{B \text{ is a subset of } A} [B = P(\sigma) \text{ for some hamp } \sigma \text{ of } D]$
(since each $N^B$ satisfies $N^B \equiv [B = P(\sigma) \text{ for some hamp } \sigma \text{ of } D] \bmod 2$)
$=$ (# of subsets B of A such that $B = P(\sigma)$ for some hamp $\sigma$ of D)
$=$ (# of sets of the form $P(\sigma)$ for some hamp $\sigma$ of D)
(because each set of the form $P(\sigma)$ for some hamp $\sigma$ of D is a subset of A
(by Observation 2))
$=$ (# of hamps of D) mod 2

(indeed, Observation 1 shows that different hamps $\sigma$ have different sets $P(\sigma)$,
so counting the sets $P(\sigma)$ for all hamps $\sigma$ is equivalent to counting the hamps
$\sigma$ themselves).

Now we have proved that $N \equiv$ (# of hamps of $\overline{D}$) mod 2 and
$N \equiv$ (# of hamps of D) mod 2. Comparing these two congruences, we obtain

(# of hamps of $\overline{D}$) $\equiv$ (# of hamps of D) mod 2.


This proves Berge’s theorem. □


4.10. Tournaments 
4.10.1. Definition 
We now introduce a special class of simple digraphs. 


Definition 4.10.1. A digraph D is said to be loopless if it has no loops.


Definition 4.10.2. A tournament is defined to be a loopless simple digraph 
D that satisfies the 


• Tournament axiom: For any two distinct vertices u and v of D, exactly
one of (u,v) and (v,u) is an arc of D.
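
The tournament axiom is easy to test mechanically. The following Python sketch assumes that a simple digraph is handed over as a list of vertices together with a collection of arcs (ordered pairs of vertices); the function name `is_tournament` and this encoding are illustrative choices, not notation from these notes.

```python
from itertools import combinations

def is_tournament(vertices, arcs):
    """Return True if the simple digraph (vertices, arcs) is a tournament."""
    arcs = set(arcs)
    # Loopless: no arc of the form (v, v).
    if any(u == v for (u, v) in arcs):
        return False
    # Tournament axiom: for distinct u, v, exactly one of (u, v), (v, u) is an arc.
    return all(((u, v) in arcs) != ((v, u) in arcs)
               for u, v in combinations(vertices, 2))

print(is_tournament([1, 2, 3], [(1, 2), (2, 3), (3, 1)]))  # a 3-cycle
print(is_tournament([1, 2, 3], [(1, 2), (2, 3)]))          # no arc between 1 and 3
```

The second call fails the axiom for u = 1 and v = 3, in the same way as the non-examples in Example 4.10.3.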


Example 4.10.3. The following digraph is a tournament:

[Figure omitted.]

The following digraph is a tournament as well:

[Figure omitted.]

However, the following digraph is not a tournament:

[Figure omitted.]

because the tournament axiom is not satisfied for u = 1 and v = 3. Nor is
the following digraph a tournament:

[Figure omitted.]

because the tournament axiom is not satisfied for u = 1 and v = 2. Finally,
the digraph

[Figure omitted.]

is not a tournament either, since it is not loopless.
The digraph D in Proposition always is a tournament.
Example 4.10.4. Here is a tournament with 5 vertices:

[Figure omitted.]

A tournament can also be viewed as a complete graph each of whose edges has
been given a direction.

Using Definition 4.9.1, we can restate the definition of a tournament as follows:


Proposition 4.10.5. Let D = (V, A) be a loopless simple digraph. Then, D is
a tournament if and only if the non-loop arcs of $\overline{D}$ are precisely the arcs of
$D^{\mathrm{rev}}$.


Proof. Easy consequence of definitions. □


Exercise 4.11. Let D be a tournament with at least one vertex. 

We say that a vertex u of D directly owns a vertex w of D if (u,w) is an 
arc of D. 

We say that a vertex u of D indirectly owns a vertex w of D if there exists 
a vertex v of D such that both (u,v) and (v, w) are arcs of D. 

Prove that D has a vertex that (directly or indirectly) owns all other ver- 
tices. 


[Solution: This exercise appears in Exercise 6.3.1] (restated in the
language of players and matches) and in [Maurer80, Theorem 1] (restated
in the language of chickens and pecking orders). It originates in a study of
pecking orders by Landau [Landau53].]


4.10.2. The Rédei theorems 


Which tournaments have hamps? The answer is surprisingly simple:²⁷


²⁷ Here we agree to consider the empty list () to be a hamp of the digraph (∅, ∅).

Theorem 4.10.6 (Easy Rédei theorem). A tournament always has at least one
hamp.


Even better, and perhaps even more surprisingly: 
Theorem 4.10.7 (Hard Rédei theorem). Let D be a tournament. Then, 


(# of hamps of D) is odd. 


Our goal now is to prove these two theorems. Clearly, the Easy Rédei Theo- 
rem follows from the Hard one, since an odd number cannot be 0. Thus, it will 
suffice to prove the Hard one. 
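
Before proving anything, one can check both theorems by brute force on small cases. The Python sketch below enumerates every tournament with vertex set {1, 2, 3, 4} and counts its hamps directly; the helper names `all_tournaments` and `count_hamps` are hypothetical, and a tournament is encoded simply as a set of arcs.

```python
from itertools import combinations, permutations, product

def all_tournaments(n):
    """Yield the arc set of every tournament with vertex set {1, ..., n}."""
    pairs = list(combinations(range(1, n + 1), 2))
    for choice in product([False, True], repeat=len(pairs)):
        # For each pair i < j, keep (i, j) or use the reversed arc (j, i).
        yield {(j, i) if flip else (i, j) for (i, j), flip in zip(pairs, choice)}

def count_hamps(n, arcs):
    """Count the orderings of the vertices whose consecutive pairs are all arcs."""
    return sum(all((p[k], p[k + 1]) in arcs for k in range(n - 1))
               for p in permutations(range(1, n + 1)))

# The distinct hamp counts that occur among all 2^6 tournaments on 4 vertices.
counts = sorted({count_hamps(4, arcs) for arcs in all_tournaments(4)})
print(counts)
```

Every value printed should be odd (and in particular nonzero), in agreement with both Rédei theorems. Such a check is of course no substitute for the proof.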

The proof of the hard Rédei theorem will rely on the following crucial lemma: 


Lemma 4.10.8. Let D = (V, A) be a tournament, and let vw ∈ A be an arc of D.

Let D’ be the digraph obtained from D by reversing the arc vw. In other
words, let

D’ := (V, (A \ {vw}) ∪ {wv}).

Then, D’ is again a tournament, and satisfies

(# of hamps of D) ≡ (# of hamps of D’) mod 2.


Here is a visualization of the setup of Lemma 4.10.8:

[Figure omitted.]

(Here, we are only showing the arcs joining v with w, since D and D’ agree in
all other arcs.)


Proof of Lemma 4.10.8. (This is an outline; see proof of Lemma 1.6.2]
for more details.)
First of all, D’ is clearly a tournament. It remains to prove the congruence.
We introduce two more digraphs: Let

$D_0$ := (the digraph D with the arc vw removed)   and
$D_2$ := (the digraph D with the arc wv added).

Note that these are not tournaments any more. Here is a comparative illustration
of all four digraphs D, D’, $D_0$ and $D_2$ (again showing only the arcs joining
v with w, since there are no differences in the other arcs):

[Figure omitted.]


The digraph $D_0$ is D’ with the arc wv removed. Therefore, a hamp of $D_0$ is
the same as a hamp of D’ that does not use the arc wv. Hence,

(# of hamps of $D_0$)
= (# of hamps of D’ that do not use the arc wv)
= (# of hamps of D’) − (# of hamps of D’ that use the arc wv).


Similarly, since D is $D_2$ with the arc wv removed, we have

(# of hamps of D)
= (# of hamps of $D_2$) − (# of hamps of $D_2$ that use the arc wv)
= (# of hamps of $D_2$) − (# of hamps of D’ that use the arc wv)

(the last equality is because a hamp of $D_2$ that uses the arc wv cannot use the
arc vw, and therefore is automatically a hamp of D’ as well, and of course the
converse is obviously true).

However, from the previously proved equality

(# of hamps of $D_0$)
= (# of hamps of D’) − (# of hamps of D’ that use the arc wv),

we obtain

(# of hamps of D’)
= (# of hamps of $D_0$) + (# of hamps of D’ that use the arc wv)
≡ (# of hamps of $D_0$) − (# of hamps of D’ that use the arc wv) mod 2

(since x + y ≡ x − y mod 2 for any integers x and y). Thus, if we can show that

(# of hamps of $D_2$) ≡ (# of hamps of $D_0$) mod 2,

then we will be able to conclude that

(# of hamps of D)
= (# of hamps of $D_2$) − (# of hamps of D’ that use the arc wv)
≡ (# of hamps of $D_0$) − (# of hamps of D’ that use the arc wv)
        (since (# of hamps of $D_2$) ≡ (# of hamps of $D_0$) mod 2)
≡ (# of hamps of D’) mod 2,

and the proof of the lemma will be complete.

So let us show this. Recall that D is a tournament. Thus, the non-loop arcs
of $\overline{D}$ are precisely the arcs of $D^{\mathrm{rev}}$ (by Proposition 4.10.5). Hence, the non-loop
arcs of $\overline{D_0}$ are precisely the arcs of $D_2^{\mathrm{rev}}$ (since $\overline{D_0}$ is just $\overline{D}$ with the extra arc vw
added, and since $D_2^{\mathrm{rev}}$ is just $D^{\mathrm{rev}}$ with the extra arc vw added). Therefore, the
digraphs $\overline{D_0}$ and $D_2^{\mathrm{rev}}$ are equal “up to loops” (i.e., they have the same vertices
and the same non-loop arcs). Since loops don’t matter for hamps, these two
digraphs thus have the same # of hamps. Hence,

(# of hamps in $\overline{D_0}$) = (# of hamps in $D_2^{\mathrm{rev}}$) = (# of hamps in $D_2$)

(by Proposition 4.9.5), and therefore

(# of hamps in $D_2$) = (# of hamps in $\overline{D_0}$) ≡ (# of hamps in $D_0$) mod 2

(by Theorem 4.9.6). As explained above, this completes the proof of Lemma
4.10.8. □


Now, the Hard Rédei theorem has become easy: 


Proof of Theorem 4.10.7. (This is an outline; see proof of Theorem 1.6.1]
for more details.)

We need to prove that the # of hamps of D is odd. Lemma 4.10.8 tells us that
the parity of this # does not change when we reverse a single arc of D. Thus, of
course, if we reverse several arcs of D, then this parity does not change either.
However, we can WLOG assume that the vertices of D are 1, 2, ..., n for some
n ∈ ℕ, and then, by reversing the appropriate arcs, we can ensure that the arcs
of D are

12, 13, 14, ..., 1n,
23, 24, ..., 2n,
...,
(n−1)n

(i.e., each arc of D has the form ij with i < j). But at this point, the tournament
D has only one hamp: namely, (1, 2, ..., n). So (# of hamps of D) = 1 is odd
at this point. Since the parity of the # of hamps of D has not changed as we
reversed our arcs, we thus conclude that it has always been odd. This proves
the Hard Rédei theorem (Theorem 4.10.7). □
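
The strategy of this proof can be replayed on a concrete tournament. The Python sketch below starts from an arbitrarily chosen tournament on {1, ..., 5} (a made-up example, not one from these notes), reverses its "backward" arcs one at a time, and records the parity of the number of hamps after each reversal; `count_hamps` is a hypothetical brute-force helper.

```python
from itertools import permutations

def count_hamps(n, arcs):
    """Count the orderings of 1..n whose consecutive pairs are all arcs."""
    return sum(all((p[k], p[k + 1]) in arcs for k in range(n - 1))
               for p in permutations(range(1, n + 1)))

n = 5
# An arbitrary example tournament: exactly one arc for each pair of vertices.
arcs = {(1, 2), (2, 3), (3, 1), (4, 1), (2, 4),
        (3, 4), (5, 1), (5, 2), (3, 5), (4, 5)}

parities = [count_hamps(n, arcs) % 2]
for (i, j) in sorted(arcs):
    if i > j:
        # Reverse the arc ij so that it points from the smaller to the larger vertex.
        arcs = (arcs - {(i, j)}) | {(j, i)}
        parities.append(count_hamps(n, arcs) % 2)

# At the end every arc has the form ij with i < j.
print(parities, count_hamps(n, arcs))
```

By Lemma 4.10.8 every recorded parity should be the same, and the final tournament should have exactly one hamp, namely (1, 2, 3, 4, 5).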


As we already mentioned, the Easy Rédei theorem follows from the Hard 
Rédei theorem. But it also has a short self-contained proof ([17s-lec7, Theorem 
1.4.9]). 


Remark 4.10.9. Theorem 4.10.7 shows that the # of hamps in a tournament
is an odd positive integer. Can it be any odd positive integer, or are certain 
odd positive integers impossible? 

Surprisingly, 7 and 21 are impossible. All other odd numbers between 1 
and 80555 are possible. For higher numbers, the answer is not known so far. 
See MathOverflow question #232751 ([MO232751]) for more details. 


4.10.3. Hamiltonian cycles in tournaments 


By the Easy Rédei theorem, every tournament has a hamp. But of course, not
every tournament has a hamc.²⁸ One obstruction is clear:


Proposition 4.10.10. If a digraph D has a hamc, then D is strongly connected.


In general, this is only a necessary criterion for a hamc, not a sufficient one. 
Not every strongly connected digraph has a hamc. However, it turns out that 
for tournaments, it is also sufficient, as long as the tournament has enough 
vertices: 


Theorem 4.10.11 (Camion’s theorem). If a tournament D is strongly con-
nected and has at least two vertices, then D has a hamc. 
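
Camion’s theorem, together with Proposition 4.10.10, says that for tournaments with at least two vertices, having a hamc is the same as being strongly connected. The Python sketch below tests this equivalence by brute force over all tournaments on a small vertex set {1, ..., n}; all function names here are illustrative assumptions.

```python
from itertools import combinations, permutations, product

def all_tournaments(n):
    """Yield the arc set of every tournament with vertex set {1, ..., n}."""
    pairs = list(combinations(range(1, n + 1), 2))
    for choice in product([False, True], repeat=len(pairs)):
        yield {(j, i) if flip else (i, j) for (i, j), flip in zip(pairs, choice)}

def reachable(start, arcs):
    """Set of vertices reachable from start by a walk along the arcs."""
    seen, stack = {start}, [start]
    while stack:
        u = stack.pop()
        for (a, b) in arcs:
            if a == u and b not in seen:
                seen.add(b)
                stack.append(b)
    return seen

def strongly_connected(n, arcs):
    # Every vertex is reachable from 1, and 1 is reachable from every vertex.
    reversed_arcs = {(b, a) for (a, b) in arcs}
    return len(reachable(1, arcs)) == n and len(reachable(1, reversed_arcs)) == n

def has_hamc(n, arcs):
    # A Hamiltonian cycle may be assumed to start and end at vertex 1.
    for rest in permutations(range(2, n + 1)):
        cyc = (1,) + rest + (1,)
        if all((cyc[k], cyc[k + 1]) in arcs for k in range(n)):
            return True
    return False

def camion_holds(n):
    return all(has_hamc(n, arcs) == strongly_connected(n, arcs)
               for arcs in all_tournaments(n))

print(camion_holds(3), camion_holds(4), camion_holds(5))
```

The search is exponential in n, so this is only meant for very small tournaments.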


Proof sketch. A detailed proof can be found in Theorem 1.5.5]; here is
just a very rough sketch.

Let D = (V, A) be a strongly connected tournament with at least two vertices.²⁹
We must show that D has a hamc.

It is easy to see that D has a cycle. Let c = ($v_1, v_2, \ldots, v_k, v_1$) be a cycle of
maximum length. We shall show that c is a hamc.

Let C be the set $\{v_1, v_2, \ldots, v_k\}$ of all vertices of this cycle c.

A vertex w ∈ V \ C will be called a to-vertex if there exists an arc from some
$v_i$ to w.


²⁸ Recall that “hamc” is our shorthand for “Hamiltonian cycle”.

²⁹ By the way, a tournament with exactly two vertices cannot be strongly connected (as it has
only 1 arc). Thus, by requiring D to have at least two vertices, we have actually guaranteed
that D has at least three vertices.

A vertex w ∈ V \ C will be called a from-vertex if there exists an arc from w
to some $v_i$.

Since D is a tournament, each vertex in V \ C is a to-vertex or a from-vertex.
In theory, a vertex could be both (having an arc from some $v_i$ and also an arc
to some other $v_j$). However, this does not actually happen. To see why, argue
as follows:


• If a to-vertex w has an arc from some $v_i$, then it must also have an arc
from $v_{i+1}$ ³⁰ (because otherwise there would be an arc from w to $v_{i+1}$,
and then we could make our cycle c longer by interjecting w between $v_i$
and $v_{i+1}$; but this would contradict the fact that c is a cycle of maximum
length).


• Iterating this argument, we see that if a to-vertex w has an arc from some
$v_i$, then it must also have an arc from $v_{i+1}$, an arc from $v_{i+2}$, an arc from
$v_{i+3}$, and so on; i.e., it must have an arc from each vertex of c. Consequently,
w cannot be a from-vertex. This shows that a to-vertex cannot be
a from-vertex.


Let F be the set of all from-vertices, and let T be the set of all to-vertices.
Then, as we have just shown, F and T are disjoint. Moreover, F ∪ T = V \ C.
Since a to-vertex cannot be a from-vertex, we furthermore conclude that any
to-vertex has an arc from each vertex of c (otherwise, it would be a from-vertex),
and that any from-vertex has an arc to each vertex of c (otherwise, it would be
a to-vertex).

Next, we argue that there cannot be an arc from a to-vertex t to a from-vertex
f. Indeed, if there was such an arc, then we could make the cycle c longer by
interjecting t and f between (say) $v_1$ and $v_2$.

In total, we now know that every vertex of D belongs to one of the three
disjoint sets C, F and T, and furthermore there is no arc from T to F, no arc
from T to C, and no arc from C to F. Thus, there exists no walk from a vertex
in T to a vertex in C (because there is no way out of T). This would contradict
the fact that D is strongly connected, unless the set T is empty. Hence, T must
be empty. Similarly, F must be empty. Since F ∪ T = V \ C, this entails that
V \ C is empty, so that V = C. In other words, each vertex of D is on our cycle
c. Therefore, c is a hamc. This proves Camion’s theorem. □


4.10.4. Application of tournaments to the Vandermonde determinant 


To wrap up the topic of tournaments, let me briefly discuss a curious appli- 
cation of their theory: a combinatorial proof of the Vandermonde determinant 
formula. See for the many details I’ll be omitting. 

Recall the Vandermonde determinant formula: 


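³⁰ Here, indices are periodic modulo k, so that $v_{k+1}$ means $v_1$.

Theorem 4.10.12 (Vandermonde determinant formula). Let $x_1, x_2, \ldots, x_n$ be
n numbers (or, more generally, elements of a commutative ring). Consider
the n × n-matrix

\[
V := \begin{pmatrix}
1 & 1 & 1 & \cdots & 1 \\
x_1 & x_2 & x_3 & \cdots & x_n \\
x_1^2 & x_2^2 & x_3^2 & \cdots & x_n^2 \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
x_1^{n-1} & x_2^{n-1} & x_3^{n-1} & \cdots & x_n^{n-1}
\end{pmatrix}
= \left( x_j^{i-1} \right)_{1 \leq i \leq n,\ 1 \leq j \leq n} .
\]

Then, its determinant is

\[
\det V = \prod_{1 \leq i < j \leq n} (x_j - x_i) .
\]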

There are many simple proofs of this theorem (e.g., a few on its ProofWiki 
page, which works with the transpose matrix). I will now outline a combina- 
torial one, using tournaments. This proof goes back to Ira Gessel’s 1979 paper 
[Gessel79]. 
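
As a sanity check (not a proof), the formula can be verified numerically for sample values. The Python sketch below computes det V by the Leibniz formula in exact integer arithmetic and compares it with the product; the sample values in `xs` are arbitrary, and the helper names are ours.

```python
from itertools import permutations
from math import prod

def sign(perm):
    """Sign of a permutation given as a tuple of 0-based images."""
    n = len(perm)
    inversions = sum(perm[a] > perm[b] for a in range(n) for b in range(a + 1, n))
    return -1 if inversions % 2 else 1

def det(M):
    """Determinant via the Leibniz formula (only sensible for small matrices)."""
    n = len(M)
    return sum(sign(p) * prod(M[i][p[i]] for i in range(n))
               for p in permutations(range(n)))

xs = [2, 3, 5, 7]                                # arbitrary sample values
n = len(xs)
V = [[x ** i for x in xs] for i in range(n)]     # row i holds the i-th powers
lhs = det(V)
rhs = prod(xs[j] - xs[i] for i in range(n) for j in range(i + 1, n))
print(lhs, rhs)
```

For these sample values the product is 1·3·5·2·4·2 = 240, and the determinant should agree with it.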


First, how do det V and $\prod_{1 \leq i < j \leq n} (x_j - x_i)$ relate to tournaments?

As a warmup, let’s assume that we have some number $y_{(i,j)}$ given for each
pair (i, j) of integers, and let’s expand the product

\[
(y_{(1,2)} + y_{(2,1)}) (y_{(1,3)} + y_{(3,1)}) (y_{(2,3)} + y_{(3,2)}) .
\]

The result is a sum of 8 products, one for each way to pluck an addend out of
each of the three little sums:

\begin{align*}
& (y_{(1,2)} + y_{(2,1)}) (y_{(1,3)} + y_{(3,1)}) (y_{(2,3)} + y_{(3,2)}) \\
&= y_{(1,2)} y_{(1,3)} y_{(2,3)} + y_{(1,2)} y_{(1,3)} y_{(3,2)} + y_{(1,2)} y_{(3,1)} y_{(2,3)} + y_{(1,2)} y_{(3,1)} y_{(3,2)} \\
&\quad + y_{(2,1)} y_{(1,3)} y_{(2,3)} + y_{(2,1)} y_{(1,3)} y_{(3,2)} + y_{(2,1)} y_{(3,1)} y_{(2,3)} + y_{(2,1)} y_{(3,1)} y_{(3,2)} .
\end{align*}


Note that each of the 8 products obtained has the form $y_a y_b y_c$, where
• a is one of the pairs (1,2) and (2,1),
• b is one of the pairs (1,3) and (3,1), and
• c is one of the pairs (2,3) and (3,2).


We can view these pairs a, b and c as the arcs of a tournament with vertex
set {1,2,3}. Thus, our above expansion can be rewritten more compactly as
follows:

\[
(y_{(1,2)} + y_{(2,1)}) (y_{(1,3)} + y_{(3,1)}) (y_{(2,3)} + y_{(3,2)})
= \sum_{\substack{D \text{ is a tournament} \\ \text{with vertex set } \{1,2,3\}}} \ \prod_{(i,j) \text{ is an arc of } D} y_{(i,j)} .
\]

For reference, here are all the 8 tournaments with vertex set {1,2,3}:

[Figure omitted.]

Here, for convenience, we are drawing an arc ij in blue if i < j and in red
otherwise.
This expansion can be generalized: We have

\[
\prod_{1 \leq i < j \leq n} (y_{(i,j)} + y_{(j,i)})
= \sum_{\substack{D \text{ is a tournament} \\ \text{with vertex set } \{1,2,\ldots,n\}}} \ \prod_{(i,j) \text{ is an arc of } D} y_{(i,j)} .
\]

Substituting $y_{(i,j)} = \begin{cases} x_j, & \text{if } i < j; \\ -x_j, & \text{if } i > j \end{cases}$ in this equality, we obtain

\begin{align*}
\prod_{1 \leq i < j \leq n} (x_j - x_i)
&= \sum_{\substack{D \text{ is a tournament} \\ \text{with vertex set } \{1,2,\ldots,n\}}} \
\underbrace{\prod_{(i,j) \text{ is an arc of } D} \begin{cases} x_j, & \text{if } i < j; \\ -x_j, & \text{if } i > j \end{cases}}_{= (-1)^{(\# \text{ of red arcs of } D)} \prod_{j=1}^{n} x_j^{\deg^- j}} \\
&= \sum_{\substack{D \text{ is a tournament} \\ \text{with vertex set } \{1,2,\ldots,n\}}} (-1)^{(\# \text{ of red arcs of } D)} \prod_{j=1}^{n} x_j^{\deg^- j}
\end{align*}

(where $\deg^- j$ means the indegree of j in D, and where the “red arcs” are the
arcs ij with i > j).

We shall refer to this sum as the “big sum”. 
On the other hand, if we let $S_n$ be the group of permutations of {1, 2, ..., n},
and if we denote the sign of a permutation σ by sign σ, then we have

\[
\det V = \det \left( V^T \right) = \sum_{\sigma \in S_n} \operatorname{sign} \sigma \cdot \prod_{j=1}^{n} x_j^{\sigma(j) - 1}
\]

(by the definition of a determinant). We shall refer to this sum as the “small
sum”.
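
Both sums can be evaluated by brute force for small n, which gives a concrete feel for the claim that they agree. The Python sketch below does this for arbitrary sample values; `big_sum` runs over all tournaments with vertex set {1, ..., n} (encoded as lists of arcs) and `small_sum` over all permutations. Both function names are our own.

```python
from itertools import combinations, permutations, product
from math import prod

def big_sum(xs):
    """Sum over all tournaments D of (-1)^(# red arcs) * prod_j x_j^(indegree of j)."""
    n = len(xs)
    pairs = list(combinations(range(1, n + 1), 2))
    total = 0
    for choice in product([False, True], repeat=len(pairs)):
        arcs = [(j, i) if flip else (i, j) for (i, j), flip in zip(pairs, choice)]
        red = sum(i > j for (i, j) in arcs)          # arcs ij with i > j
        indeg = {v: 0 for v in range(1, n + 1)}
        for (_, head) in arcs:
            indeg[head] += 1
        total += (-1) ** red * prod(xs[j - 1] ** indeg[j] for j in indeg)
    return total

def small_sum(xs):
    """Sum over all permutations sigma of sign(sigma) * prod_j x_j^(sigma(j) - 1)."""
    n = len(xs)
    total = 0
    for p in permutations(range(1, n + 1)):          # p[j - 1] plays the role of sigma(j)
        inversions = sum(p[a] > p[b] for a in range(n) for b in range(a + 1, n))
        total += (-1) ** inversions * prod(xs[j] ** (p[j] - 1) for j in range(n))
    return total

xs = [2, 3, 5, 7]                                    # arbitrary sample values
print(big_sum(xs), small_sum(xs))
```

For n = 4 the big sum has 2^6 = 64 addends while the small sum has only 4! = 24, so most of the addends of the big sum must cancel.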
Our goal is to prove that the big sum equals the small sum. To prove this, we
must verify the following:


1. Each addend of the small sum is an addend of the big sum. Indeed, for
each permutation σ ∈ $S_n$, there is a certain tournament $T_\sigma$ that has

\[
(-1)^{(\# \text{ of red arcs of } T_\sigma)} \prod_{j=1}^{n} x_j^{\deg^- j} = \operatorname{sign} \sigma \cdot \prod_{j=1}^{n} x_j^{\sigma(j) - 1} .
\]

Can you find this $T_\sigma$?


2. All the addends of the big sum that are not addends of the small sum
cancel each other out. Why?


The basic idea is to argue that if a tournament D appears in the big sum
but not in the small sum, then D has a 3-cycle (i.e., a cycle of length
3). When we reverse such a 3-cycle (i.e., we reverse each of its arcs), the
indegrees of all vertices are preserved, but the sign $(-1)^{(\# \text{ of red arcs of } D)}$ is
flipped (since three arcs change their orientation).


This suffices to show that for each addend that appears in the big sum but
not in the small sum, there is another addend with the same magnitude
but with opposite sign. Unfortunately, this in itself does not suffice to
ensure that all these addends cancel out; for example, the sum 1 + 1 + 1 +
(−1) has the same property but does not equal 0. We need to show that
the # of addends with positive sign (i.e., with $(-1)^{(\# \text{ of red arcs of } D)} = 1$)
and a given magnitude equals the # of addends with negative sign (i.e.,
with $(-1)^{(\# \text{ of red arcs of } D)} = -1$) and the same magnitude.


One way to achieve this would be by constructing a bijection (aka “perfect 
matching”) between the “positive” and the “negative” addends. This is 
tricky here: We would have to decide which 3-cycle to reverse (as there 
are usually many of them), and this has to be done in a bijective way 
(so that two “positive” addends don’t get assigned the same “negative” 
partner). 


A less direct, but easier way is the following: Fix a positive integer k, 
and consider only the tournaments with exactly k many 3-cycles. For 
each such tournament, we can reverse any of its k many 3-cycles. It can 
be shown (nice exercise!) that reversing the arcs of a 3-cycle does not 
change the # of all 3-cycles; thus, we don’t accidentally change our k in the 
process. Thus, we find a “k-to-k” correspondence between the “positive” 
addends of a given magnitude and the “negative” addends of the same 
magnitude. As one can easily see, this entails that the former and the 
latter are equinumerous, and thus really cancel out. The addends that 
remain are exactly those in the small sum. 
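
The three facts about 3-cycle reversal used in this argument (indegrees are preserved, the sign flips, and the # of 3-cycles does not change) can be observed on an example. The Python sketch below uses an arbitrarily chosen tournament on {1, ..., 5}, which is not one from these notes, and checks each fact for every one of its 3-cycles; the helper names are hypothetical.

```python
from itertools import permutations

def three_cycles(vertices, arcs):
    """All triples (u, v, w) such that (u,v), (v,w), (w,u) are arcs."""
    return [(u, v, w) for u, v, w in permutations(vertices, 3)
            if (u, v) in arcs and (v, w) in arcs and (w, u) in arcs]

def reverse_3_cycle(arcs, u, v, w):
    """Reverse the three arcs of the 3-cycle (u, v, w)."""
    old = {(u, v), (v, w), (w, u)}
    assert old <= arcs, "(u, v, w) must be a 3-cycle"
    return (arcs - old) | {(b, a) for (a, b) in old}

def indegrees(vertices, arcs):
    return {v: sum(1 for (_, head) in arcs if head == v) for v in vertices}

def sign(arcs):
    """(-1)^(# of red arcs), where an arc ij is red if i > j."""
    return (-1) ** sum(i > j for (i, j) in arcs)

vertices = [1, 2, 3, 4, 5]
D = {(1, 2), (2, 3), (3, 1), (4, 1), (2, 4),
     (3, 4), (5, 1), (5, 2), (3, 5), (4, 5)}

for (u, v, w) in three_cycles(vertices, D):
    E = reverse_3_cycle(D, u, v, w)
    assert indegrees(vertices, E) == indegrees(vertices, D)   # same magnitude
    assert sign(E) == -sign(D)                                # opposite sign
    assert len(three_cycles(vertices, E)) == len(three_cycles(vertices, D))

print(len(three_cycles(vertices, D)))
```

This only illustrates the claims on one tournament; the "nice exercise" of proving them in general remains.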


As already mentioned, this is only a rough summary of the proof; the details
can be found in [17s-lec8].


4.11. Exercises on tournaments 


There is, of course, much more to say about tournaments. See for a 
selection of topics. Let us merely hint at some possible directions by giving a 
few exercises. 

The next three exercises use the notion of a “3-cycle”: 


Definition 4.11.1. A 3-cycle in a tournament D = (V,A) means a triple 
(u,v,w) of vertices in V such that all three pairs (u,v), (v,w) and (w,u) 
belong to A. 


For example, the tournament shown in Example 4.10.4 has the nine different
3-cycles 


(1,4,3), (1,5,3), (2,5,3), (3,1,4), 
(3,1,5), (3,2,5), (4,3,1), (5,3,1), 
(5,3,2). 


(Yes, we are counting a 3-cycle (u,v,w) as being distinct from (v,w,u) and 
(w,u,v).) 
Exercise 4.12. Let D = (V, A) be a tournament. Set n = |V| and
$m = \sum_{v \in V} \binom{\deg^- v}{2}$.

(a) Show that $m = \sum_{v \in V} \binom{\deg^+ v}{2}$.

(b) Show that the number of 3-cycles in D is $3 \left( \binom{n}{3} - m \right)$.


[Solution: This is Exercise 5 on homework set #2 from my Spring 2017 
course; see the course page for solutions.] 
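
Both parts of this exercise can be tested (though not proved) by exhaustive search over small tournaments. The Python sketch below assumes the formulas read as m = Σ C(deg⁻ v, 2) and 3 (C(n,3) − m), and checks them for every tournament with vertex set {1, ..., n}; the helper names are illustrative.

```python
from itertools import combinations, permutations, product
from math import comb

def all_tournaments(n):
    """Yield the arc set of every tournament with vertex set {1, ..., n}."""
    pairs = list(combinations(range(1, n + 1), 2))
    for choice in product([False, True], repeat=len(pairs)):
        yield {(j, i) if flip else (i, j) for (i, j), flip in zip(pairs, choice)}

def check(n):
    """Check parts (a) and (b) for all tournaments on n vertices."""
    V = range(1, n + 1)
    for arcs in all_tournaments(n):
        indeg = {v: sum(head == v for (_, head) in arcs) for v in V}
        outdeg = {v: sum(tail == v for (tail, _) in arcs) for v in V}
        m = sum(comb(d, 2) for d in indeg.values())
        m_out = sum(comb(d, 2) for d in outdeg.values())
        # 3-cycles in the sense of Definition 4.11.1 (rotations counted separately).
        cycles = sum((u, v) in arcs and (v, w) in arcs and (w, u) in arcs
                     for u, v, w in permutations(V, 3))
        if m != m_out or cycles != 3 * (comb(n, 3) - m):
            return False
    return True

print(check(3), check(4), check(5))
```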


The next exercise uses the notation $\deg^-_D v$ for the indegree of a vertex v in a
digraph D. (We usually denote this by $\deg^- v$, but sometimes it is important to
stress the dependence on D, since v can be a vertex of two different digraphs.)


Exercise 4.13. If a tournament D has a 3-cycle (u, v, w), then we can define a
new tournament $D'_{u,v,w}$ as follows: The vertices of $D'_{u,v,w}$ shall be the same as
those of D. The arcs of $D'_{u,v,w}$ shall be the same as those of D, except that the
three arcs (u,v), (v,w) and (w,u) are replaced by the three new arcs (v,u),
(w,v) and (u,w). (Visually speaking, $D'_{u,v,w}$ is obtained from D by turning
the arrows on the arcs (u,v), (v,w) and (w,u) around.) We say that the
new tournament $D'_{u,v,w}$ is obtained from the old tournament D by a 3-cycle
reversal operation.

Now, let V be a finite set, and let E and F be two tournaments with vertex
set V. Prove that F can be obtained from E by a sequence of 3-cycle reversal
operations if and only if each v ∈ V satisfies $\deg^-_E (v) = \deg^-_F (v)$. (Note that
a sequence may be empty, which allows handling the case E = F even if E
has no 3-cycles to reverse.)


[Solution: This is Exercise 6 on homework set #2 from my Spring 2017
course; see the course page for solutions.]


Exercise 4.14. A tournament D = (V, A) is called transitive if it has no 3-cycles.

If a tournament D = (V, A) has three distinct vertices u, v and w satisfying
(u,v) ∈ A and (v,w) ∈ A, then we can define a new tournament $D''_{u,v,w}$ as
follows: The vertices of $D''_{u,v,w}$ shall be the same as those of D. The arcs of
$D''_{u,v,w}$ shall be the same as those of D, except that the two arcs (u,v) and
(v,w) are replaced by the two new arcs (v,u) and (w,v). We say that the
new tournament $D''_{u,v,w}$ is obtained from the old tournament D by a 2-path
reversal operation.

Let D be any tournament. Prove that there is a sequence of 2-path reversal
operations that transforms D into a transitive tournament.

[Solution: This is Exercise 7 on homework set #2 from my Spring 2017
course; see the course page for solutions.]


5. Trees and arborescences 


Trees are particularly nice graphs. Among other things, they can be character- 
ized as 


• the minimal connected graphs on a given set of vertices, or


• the maximal acyclic (= having no cycles) graphs on a given set of vertices,
or


• in many other ways.


Arborescences are their closest analogue for digraphs. 

In this chapter, we will discuss the theory of trees and some of their ap- 
plications. Further applications are usually covered in courses in theoretical 
computer science, but their notion of a tree is somewhat different from ours. 


5.1. Some general properties of components and cycles 
5.1.1. Backtrack-free walks revisited 


Before we start with trees, let us recall and prove some more facts about general
multigraphs. Recall the notion of a “backtrack-free walk” that already had a
brief appearance in the proof of Theorem 2.10.7:


Definition 5.1.1. Let G be a multigraph. A backtrack-free walk of G means 
a walk w such that no two adjacent edges of w are identical. 


Here are a few properties of this notion: 


Proposition 5.1.2. Let G be a multigraph. Let w be a backtrack-free walk of 
G. Then, w either is a path or contains a cycle. 


Proof. We have already proved this for simple graphs (in Proposition 2.10.4).
More or less the same argument works for multigraphs. (“More or less” because
the definition of a cycle in a multigraph is slightly different from that in
a simple graph; but the proof is easy to adapt.) □


Theorem 5.1.3. Let G be a multigraph. Let u and v be two vertices of G. 
Assume that there are two distinct backtrack-free walks from u to v in G. 
Then, G has a cycle. 


Proof. We have already proved this for simple graphs (Claim 1 in the proof of
Theorem 2.10.7). More or less the same argument works for multigraphs. □

5.1.2. Counting components


Next, we shall derive a few properties of the number of components of a graph. 
Again, we have already done most of the hard work, and we can now derive 
corollaries. First, we give this number a name: 


Definition 5.1.4. Let G be a multigraph. Then, conn G means the number of
components of G. (Some authors also call this number $b_0(G)$. This notation
comes from algebraic topology, where it stands for the 0-th Betti number.
This makes sense, because we can regard a multigraph G as a topological
space. But we won’t need this.)


So a multigraph G satisfies conn G = 1 if and only if G is connected. More- 
over, conn G = 0 if and only if G has no vertices. 
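
For experiments, conn G is easy to compute. The following Python sketch uses a union-find structure; it assumes that a multigraph is given as a list of vertices plus a list of edges, each edge being recorded only by its pair of endpoints (so loops and parallel edges are allowed). This encoding and the name `conn` as a function are our own choices.

```python
def conn(vertices, edges):
    """Number of components of the multigraph with the given vertices and edges."""
    parent = {v: v for v in vertices}

    def find(v):
        # Follow parent pointers to the representative, halving the path on the way.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for (u, v) in edges:
        parent[find(u)] = find(v)      # merge the components of the two endpoints

    return len({find(v) for v in vertices})

print(conn([1, 2, 3, 4], [(1, 2), (2, 1), (3, 3)]))   # components {1,2}, {3}, {4}
print(conn([], []))                                    # no vertices
```

The two calls should print 3 and 0, the latter matching the convention that conn G = 0 when G has no vertices.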

Let us next recall Definition 3.3.17 and Theorem 3.3.18 (which is an analogue
of Theorem and can be proved in more or less the same way). As a
consequence of the latter theorem, we obtain the following:


Corollary 5.1.5. Let G be a multigraph. Let e be an edge of G. Then: 


(a) If e is an edge of some cycle of G, then conn (G \ e) = conn G. 
(b) If e appears in no cycle of G, then conn (G \ e) = connG +1. 


(c) In either case, we have conn (G \ e) ≤ conn G + 1.


Proof. Part (a) follows from Theorem 3.3.18 (a). Part (b) follows from Theorem
3.3.18 (b). Part (c) follows by combining parts (a) and (b). □


Corollary 5.1.6. Let G = (V, E, φ) be a multigraph. Then, conn G ≥ |V| − |E|.


Proof. We induct on |E|:

Base case: If |E| = 0, then conn G = |V| (since |E| = 0 means that the graph
G has no edges, and thus no two distinct vertices are path-connected); but this
rewrites as conn G = |V| − |E| (since |E| = 0). Thus, Corollary 5.1.6 is proved
for |E| = 0.

Induction step: Let k ∈ ℕ. Assume (as the induction hypothesis) that Corollary
5.1.6 holds for |E| = k. We must now show that it also holds for |E| = k + 1.

So let us consider a multigraph G = (V, E, φ) with |E| = k + 1. Thus, |E| −
1 = k. Pick any edge e ∈ E (such an edge exists, since |E| = k + 1 ≥ 1 > 0).
Then, the multigraph G \ e has edge set E \ {e} and therefore has |E \ {e}| =
|E| − 1 = k many edges. Hence, by the induction hypothesis, we have

conn (G \ e) ≥ |V| − |E \ {e}|

(since G \ e is a multigraph with vertex set V and edge set E \ {e}). However,
Corollary 5.1.5 (c) yields conn (G \ e) ≤ conn G + 1. Thus,

conn G ≥ conn (G \ e) − 1 ≥ |V| − |E \ {e}| − 1 = |V| − (|E| − 1) − 1 = |V| − |E|

(here, we have used conn (G \ e) ≥ |V| − |E \ {e}| and |E \ {e}| = |E| − 1).
This completes the induction step. Thus, Corollary 5.1.6 is proven. □


Corollary 5.1.7. Let G = (V, E, φ) be a multigraph that has no cycles. Then,
conn G = |V| − |E|.


Proof. Replay the proof of Corollary 5.1.6 with just a few changes: Instead of
applying Corollary 5.1.5 (c), apply Corollary 5.1.5 (b) (this is allowed because
G has no cycles and thus e appears in no cycle of G). The induction hypothesis
can be used because when G has no cycles, G \ e has no cycles either. All ≤ and
≥ signs in the above proof now can be replaced by = signs (since Corollary
5.1.5 (b) claims an equality, not an inequality). The result is therefore conn G =
|V| − |E|. □


Corollary 5.1.8. Let G = (V, E, φ) be a multigraph that has at least one cycle.
Then, conn G ≥ |V| − |E| + 1.


Proof. Pick an edge e ∈ E that belongs to some cycle (such an edge exists, since
G has at least one cycle). Then, Corollary 5.1.5 (a) yields conn (G \ e) = conn G.
However, Corollary 5.1.6 (applied to G \ e and E \ {e} instead of G and E) yields

conn (G \ e) ≥ |V| − |E \ {e}| = |V| − (|E| − 1) = |V| − |E| + 1

(since |E \ {e}| = |E| − 1).
Since conn (G \ e) = conn G, this rewrites as conn G ≥ |V| − |E| + 1. □


We summarize what we have proved into one convenient theorem: 


Theorem 5.1.9. Let G = (V, E, φ) be a multigraph. Then:


(a) We always have conn G ≥ |V| − |E|.
(b) We have conn G = |V| − |E| if and only if G has no cycles.


Proof. (a) This is Corollary 5.1.6.


(b) ⇐: This is Corollary 5.1.7.

⇒: Assume that conn G = |V| − |E|. If G had any cycles, then Corollary
5.1.8 would yield conn G ≥ |V| − |E| + 1 > |V| − |E|, which would contradict
conn G = |V| − |E|. So G has no cycles. This proves the “⇒” direction of
Theorem 5.1.9 (b). □
Remark 5.1.10. Let G = (V, E, φ) be a multigraph. Does the number

conn G − (|V| − |E|)

have anything to do with how many cycles G has? We know that it is 0 if G
has no cycles. More generally, could it just be the number of cycles of G?
(Let’s say we count reversals and cyclic rotations of a cycle as being the same
cycle.)

Unfortunately, the answer is no. For example, a complete graph $K_n$ (with n ≥ 4)
has many more than $1 - \left( n - \binom{n}{2} \right)$ many cycles. However, there is still
some subtler connection. The number conn G − (|V| − |E|) is known as the
circuit rank or the cyclomatic number of G, and is the dimension of a certain
vector space that, in some way, consists of cycles.
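
To make the remark concrete, here is a small Python sketch that computes the circuit rank conn G − (|V| − |E|) and evaluates it for the complete graph on 4 vertices. It assumes a multigraph is given as a vertex list plus a list of endpoint pairs; the function names are ours, and `conn` is a plain union-find component counter.

```python
from itertools import combinations

def conn(vertices, edges):
    """Number of components, computed with a simple union-find."""
    parent = {v: v for v in vertices}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for (u, v) in edges:
        parent[find(u)] = find(v)
    return len({find(v) for v in vertices})

def circuit_rank(vertices, edges):
    """The number conn G - (|V| - |E|)."""
    return conn(vertices, edges) - (len(vertices) - len(edges))

V = [1, 2, 3, 4]
E = list(combinations(V, 2))        # the 6 edges of the complete graph K_4
print(circuit_rank(V, E))           # 1 - (4 - 6) = 3
```

The circuit rank of K_4 is 3, whereas K_4 already contains four triangles, so the circuit rank cannot simply be the number of cycles.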


5.2. Forests and trees 
5.2.1. Definitions 


We now introduce two of the heroes of this chapter: 
Definition 5.2.1. A forest is a multigraph with no cycles. 


(In particular, a forest therefore cannot contain two distinct parallel edges. 
It also cannot contain loops.) 


Definition 5.2.2. A tree is a connected forest.
Example 5.2.3. Consider the following multigraphs:

[Figure omitted.]

(Yes, G is an empty graph with no vertices.) Which of them are forests, and
which are trees?


• The graph A is not a forest, since it has a cycle (actually, several cycles).
Thus, A is not a tree either.


• The graph B is a tree.
• The graph C is a forest, but not a tree, since it is not connected.
• The graph D is a tree.


• The graph E is a forest, but not a tree.
• The graph F is not a forest, since it has cycles.


• The graph G (which has no vertices and no edges) is a forest, but not
a tree, since it is not connected (recall: a graph is connected if it has 1
component; but G has 0 components).


• The graph H is a tree.


5.2.2. The tree equivalence theorem 


Trees can be described in many ways: 


Theorem 5.2.4 (The tree equivalence theorem). Let G = (V, E, φ) be a multigraph.
Then, the following eight statements are equivalent:


• Statement T1: The multigraph G is a tree.


• Statement T2: The multigraph G has no loops, and we have V ≠ ∅,
and for each u ∈ V and v ∈ V, there is a unique path from u to v.


• Statement T3: We have V ≠ ∅, and for each u ∈ V and v ∈ V, there is
a unique backtrack-free walk from u to v.


• Statement T4: The multigraph G is connected, and we have |E| =
|V| − 1.


• Statement T5: The multigraph G is connected, and we have |E| < |V|.


• Statement T6: We have V ≠ ∅, and the graph G is a forest, but adding
any new edge to G creates a cycle.


• Statement T7: The multigraph G is connected, but removing any edge
from G yields a disconnected (i.e., non-connected) graph.


• Statement T8: The multigraph G is a forest, and we have |E| ≥ |V| − 1
and V ≠ ∅.
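
Of the eight statements, T4 is the most convenient one to test on a computer, since it needs only a component count and two cardinalities. The Python sketch below decides whether a multigraph is a tree in this way; it assumes the same vertex-list and endpoint-pair encoding as before, and the names `conn` and `is_tree` are illustrative.

```python
def conn(vertices, edges):
    """Number of components, computed with a simple union-find."""
    parent = {v: v for v in vertices}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for (u, v) in edges:
        parent[find(u)] = find(v)
    return len({find(v) for v in vertices})

def is_tree(vertices, edges):
    # Statement T4: G is connected (conn G = 1) and |E| = |V| - 1.
    return conn(vertices, edges) == 1 and len(edges) == len(vertices) - 1

print(is_tree([1, 2, 3], [(1, 2), (2, 3)]))           # a path: a tree
print(is_tree([1, 2, 3], [(1, 2), (2, 3), (3, 1)]))   # a triangle: has a cycle
print(is_tree([], []))                                # no vertices: not connected
```

The last call reflects the convention from Example 5.2.3 that the graph with no vertices has 0 components and therefore is not a tree.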
Proof. We shall prove the following implications:

[Figure omitted.]

In this digraph, an arc from Ti to Tj stands for the implication Ti ⟹ Tj. Since
this digraph is strongly connected (i.e., you can travel from Statement Ti to
Statement Tj along its arcs for any i, j), this will prove the theorem. So let us
prove the implications.


Proof of T1 ⟹ T3: Assume that Statement T1 holds. Thus, G is a tree. Therefore,
G is connected, so that V ≠ ∅. We must prove that for each u ∈ V and
v ∈ V, there is a unique backtrack-free walk from u to v. The existence of such
a walk is clear (since G is connected, so there is a path from u to v). Thus, we
only need to show that it is unique. But this is easy: If there were two distinct
backtrack-free walks from u to v (for some u ∈ V and v ∈ V), then Theorem
5.1.3 would show that G has a cycle, and thus G could not be a forest, let alone
a tree. Thus, the backtrack-free walk from u to v is unique. So we have proved
Statement T3. The implication T1 ⟹ T3 is thus proved.


Proof of T3 ⟹ T2: Assume that Statement T3 holds. We must prove that Statement
T2 holds. First, G has no loops, because if there was a loop e with endpoint
u, then the two walks (u) and (u, e, u) would be two distinct backtrack-free
walks from u to u. It remains to prove that for each u ∈ V and
v ∈ V, there is a unique path from u to v. However, the existence of a walk
from u to v always implies the existence of a path from u to v (by Corollary
3.3.10). Moreover, the uniqueness of a backtrack-free walk from u to v implies
the uniqueness of a path from u to v (since any path is a backtrack-free walk).
Thus, Statement T2 follows from Statement T3.


Proof of T2 ⟹ T7: Assume that Statement T2 holds. Then, G is connected.
Now, let us remove any edge e from G. Let u and v be the endpoints of e. Then,
u ≠ v (since G has no loops). There cannot be a path from u to v in the graph
G \ e (because if there was such a path, then it would also be a path from u to v
in the graph G, and this path would be distinct from the path (u, e, v); thus, the
graph G would have at least two paths from u to v; but this would contradict
the uniqueness part of Statement T2). Hence, the graph G \ e is disconnected.
So we have shown that G is connected, but removing any edge from G yields a
disconnected graph. In other words, Statement T7 holds.
Proof of T7 ⟹ T1: Assume that Statement T7 holds. We must show that G 
is a tree. Since G is connected (by Statement T7), it suffices to show that G 
is a forest, i.e., that G has no cycles. However, if G had any cycle, then we 
could pick any edge e of this cycle, and then we would know that G \ e is still 
connected (since Corollary (a) would yield conn (G \e) = connG = 1), 
and this would contradict Statement T7. Thus, G has no cycles, hence is a 
forest. This proves Statement T1. 


Proof of T1 ⟹ T6: Assume that Statement T1 holds. Thus, G is a tree. We 
must show that adding any new edge to G creates a cycle (since all other parts 
of Statement T6 are clear). 

Indeed, let us add a new edge f to G. Let u and v be the endpoints of f. The 
graph G is connected, so there is already a path from u to v in G. Combining 
this path with the edge f, we obtain a cycle. Thus, the graph obtained from G 
by adding the new edge f has a cycle. This completes our proof that Statement 
T6 holds. 


Proof of T6 ⟹ T1: Assume that Statement T6 holds. Thus, G is a forest. We 
must only show that G is connected. 

Assume the contrary. Thus, there exist two vertices u and v of G that are not 
path-connected in G. Hence, adding a new edge f with endpoints u and v to 
the graph G cannot create a new cycle (because any such cycle would have to 
contain f (otherwise, it would already be a cycle of G, but G has no cycles), and 
then we could remove f from it to obtain a path from u to v in G; but such a 
path cannot exist, since u and v are not path-connected in G). This contradicts 
Statement T6. 

So we have shown that G is connected, and thus G is a tree. This proves 
Statement T1. 


Proof of T1 ⟹ T8: Assume that Statement T1 holds. So G is a tree. Clearly, G 
is then a forest. We must show that |E| ≥ |V| − 1. 

Theorem 5.1.9 (a) yields conn G ≥ |V| − |E|. But we have conn G = 1 because 
G is connected. Thus, 1 = conn G ≥ |V| − |E|. In other words, |E| ≥ |V| − 1. 
This proves Statement T8. 


Proof of T8 ⟹ T1: Assume that Statement T8 holds. Thus, G is a forest. We 
must only show that G is connected. However, G is a forest, and thus has 
no cycles. Hence, Theorem 5.1.9 (b) yields conn G = |V| − |E| ≤ 1 (since 
Statement T8 yields |E| ≥ |V| − 1). On the other hand, conn G ≥ 1 (since V ≠ ∅). 
Combining these two inequalities, we obtain conn G = 1. In other words, G is 
connected. This yields Statement T1 (since G is a forest). 


Proof of T1 ⟹ T4: Assume that Statement T1 holds. Then, G is a tree, hence a 
connected forest. Therefore, G has no cycles (by the definition of a forest). Theorem 
5.1.9 (b) therefore yields conn G = |V| − |E|. Thus, |V| − |E| = conn G = 1 
(since G is connected), so that |E| = |V| − 1. Thus, Statement T4 is proved. 


Proof of T4 ⟹ T5: The implication T4 ⟹ T5 is obvious. 


An introduction to graph theory, version August 2, 2023 page 151 


Proof of T5 ⟹ T1: Assume that Statement T5 holds. Thus, the multigraph G 
is connected, and we have |E| < |V|. Thus, |E| ≤ |V| − 1. In other words, 
1 ≤ |V| − |E|. Since G is connected, we have conn G = 1 ≤ |V| − |E|. However, 
Theorem 5.1.9 (a) yields conn G ≥ |V| − |E|. Combining these two inequalities, 
we obtain conn G = |V| − |E|. Thus, Theorem 5.1.9 (b) shows that G has no 
cycles. In other words, G is a forest. Hence, G is a tree (since G is connected). 
This proves Statement T1. 


We have now proved all necessary implications to conclude that all eight 
statements T1, T2, ..., T8 are equivalent. Theorem 5.2.4 is thus proved. □
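
The equivalence can be tried out on small examples. The following Python sketch is only an illustration: the representation of a multigraph (a list of vertices plus a dict sending each edge name to its pair of endpoints, with a loop having two equal endpoints) and all function names are our own assumptions, not notation from the text. It tests Statement T4 and Statement T7 independently, so that one can watch them agree.

```python
def count_components(vertices, edges):
    """Return conn G, using a simple union-find structure."""
    parent = {v: v for v in vertices}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for u, v in edges.values():
        parent[find(u)] = find(v)
    return len({find(v) for v in vertices})


def is_tree_T4(vertices, edges):
    """Statement T4: G is connected and |E| = |V| - 1."""
    return (count_components(vertices, edges) == 1
            and len(edges) == len(vertices) - 1)


def is_tree_T7(vertices, edges):
    """Statement T7: G is connected, but removing any edge disconnects it."""
    if count_components(vertices, edges) != 1:
        return False
    return all(
        count_components(
            vertices, {k: e for k, e in edges.items() if k != name}
        ) != 1
        for name in edges
    )
```

Both functions accept a path graph and reject a triangle, a pair of parallel edges, and a single vertex with a loop. The test via T7 recomputes the components once per edge, so it is much slower than the test via T4; this is fine for experiments but not for large graphs.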


We also observe the following connection between trees and forests: 


Proposition 5.2.5. Let G be a multigraph, and let C_1, C_2, ..., C_k be its components. 
Then, G is a forest if and only if all the induced subgraphs 
G[C_1], G[C_2], ..., G[C_k] are trees. 


Proof. ⟹: Assume that G is a forest. Thus, G has no cycles. Hence, the induced 
subgraphs G[C_1], G[C_2], ..., G[C_k] have no cycles either (since a cycle in any 
of them would be a cycle of G); in other words, they are forests. But they are 
furthermore connected (since the induced subgraph on a component is always 
connected). Hence, they are connected forests, i.e., trees. 

⟸: Assume that the induced subgraphs G[C_1], G[C_2], ..., G[C_k] are trees. 
Hence, none of them has a cycle. Thus, G has no cycles either (since a cycle of 
G would have to be fully contained in one of these induced subgraphs³¹). In 
other words, G is a forest. □


5.2.3. Summary 


Let us briefly summarize some properties of trees: 
If T = (V, E, φ) is a tree, then... 


• T is a connected forest. (This is how trees were defined.) Thus, T has no 
cycles. (This is how forests were defined.) 


• we have |E| = |V| − 1. (This follows from the implication T1 ⟹ T4 in 
Theorem 5.2.4.) 


• adding any new edge to T creates a cycle. (This follows from the implication 
T1 ⟹ T6 in Theorem 5.2.4.) 


• removing any edge from T yields a disconnected (i.e., non-connected) 
graph. (This follows from the implication T1 ⟹ T7 in Theorem 5.2.4.) 


• for each u ∈ V and v ∈ V, there is a unique backtrack-free walk from 
u to v. (This follows from the implication T1 ⟹ T3 in Theorem 5.2.4.) 
Moreover, this backtrack-free walk is a path (since any walk from u to v 
contains a path from u to v). 

³¹ Indeed, if it wasn’t, then it would contain vertices from different components. But this is 
impossible, since there are no walks between vertices in different components. 


Remark 5.2.6. Computer scientists use some notions of “trees” that are sim- 
ilar to ours, but not quite the same. In particular, their trees often have roots 
(i.e., one vertex is chosen to be called “the root” of the tree), which leads to 
a parent/child relationship on each edge (namely: the endpoint closer to the 
root is called the “parent” of the endpoint further away from the root). Of- 
ten, they also impose a total order on the children of each given vertex. With 
these extra data, a tree can be used for addressing objects, since each vertex 
has a unique “path description” from the root leading to it (e.g., “the second 
child of the fourth child of the root”). But this all is going too far afield for 
us here; we are mainly interested in trees as graphs, and won’t impose any 
extra structure unless we need it for something. 


Exercise 5.1. Let G be a multigraph that has no loops. Assume that there 
exists a vertex u of G such that 


for each vertex v of G, there is a unique path from u to v in G. 


Prove that G is a tree. 


[Remark: Pay attention to the quantifiers used here: ∃u ∀v. This differs 
from the ∀u ∀v in Statement T2 of the tree equivalence theorem (Theorem 
5.2.4).] 


5.3. Leaves 


Continuing with our faux-botanical terminology, we define leaves in a tree: 


Definition 5.3.1. Let T be a tree. A vertex of T is said to be a leaf if its degree 
is 1. 


For example, the tree 


has three leaves: 1, 2 and 4. 
How to find a tree with as many leaves as possible (for a given number of 
vertices)? For any n ≥ 3, the simple graph 


({0, 1, ..., n − 1}, {0i | i > 0}) 


is a tree (when considered as a multigraph), and has n − 1 leaves (namely, all 
of 1, 2, ..., n − 1). This tree is called an n-star graph, as it looks as follows: 


(for n = 8). 


It is easy to see that no tree with n ≥ 3 vertices can have more than n − 1 leaves, 
so the n-star graph is optimal in this sense. Note that for n = 2, the n-star graph 
has 2 leaves, not 1. 

How to find a tree with as few leaves as possible? For any n ≥ 2, the n-path 
graph 

[Figure: the n-path graph P_n on the vertices 1, 2, ..., n] 

is a tree with only 2 leaves (viz., the vertices 1 and n). Can we find a tree with 
fewer leaves? For n = 1, yes, because the 1-path graph P_1 (this is simply the 
graph with 1 vertex and no edges) has no leaves at all. However, for n ≥ 2, the 
n-path graph is the best we can do: 


Theorem 5.3.2. Let T be a tree with at least 2 vertices. Then: 


(a) The tree T has at least 2 leaves. 


(b) Let v be a vertex of T. Then, there exist two distinct leaves p and q of T 
such that v lies on the path from p to q. 


Note that I’m saying “the path” rather than “a path” here. This is allowed, 
because in a tree, for any two vertices p and q, there is a unique path from p 
to q. This follows from Statement T2 in the tree equivalence theorem (Theorem 
5.2.4). 
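
The leaf counts of the star and path graphs are easy to confirm by machine. Here is a small Python sketch under our own assumptions: a multigraph is stored as a list of vertices and a dict from edge names to endpoint pairs, the helper names `degrees`, `leaves`, `star` and `path` are hypothetical, and a loop is counted twice in the degree (which is irrelevant for trees, since they have no loops).

```python
from collections import Counter


def degrees(vertices, edges):
    """Degree of each vertex; a loop contributes 2 to its endpoint."""
    deg = Counter({v: 0 for v in vertices})
    for u, v in edges.values():
        deg[u] += 1
        deg[v] += 1
    return deg


def leaves(vertices, edges):
    """Vertices of degree 1, as in Definition 5.3.1."""
    deg = degrees(vertices, edges)
    return [v for v in vertices if deg[v] == 1]


def star(n):
    """n-star graph: vertices 0, ..., n-1 and edges 0i for i > 0."""
    return list(range(n)), {f"e{i}": (0, i) for i in range(1, n)}


def path(n):
    """n-path graph: vertices 1, ..., n and edges i(i+1)."""
    return list(range(1, n + 1)), {f"e{i}": (i, i + 1) for i in range(1, n)}
```

For instance, `leaves(*star(8))` lists the seven vertices 1, ..., 7, whereas `leaves(*path(5))` lists only the two endpoints 1 and 5, and `leaves(*path(1))` is empty.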


Proof of Theorem 5.3.2. (b) We apply a variant of the “longest path trick”: Among 
all paths that contain the vertex v, let w be a longest one. Let p be the starting 
point of w, and let q be the ending point of w. We shall show that p and q are 
two distinct leaves. 
[Here is a picture of w, for what it’s worth: 


Of course, the tree T can have other edges as well, not just those of w.] 

First, we observe that T is connected (since T is a tree), and has at least one 
vertex u distinct from v (since T has at least 2 vertices). Hence, T has a path r 
that connects v to u. This path r must contain at least one edge (since u ≠ v). 
Thus, we have found a path r of T that contains v and contains at least one 
edge. Hence, the path w must contain at least one edge as well (since w is a 
longest path that contains v, and thus cannot be shorter than r). Since w is a 
path from p to q, we thus conclude that p ≠ q (because if a path contains at 
least one edge, then its starting point is distinct from its ending point). 

Now, assume (for the sake of contradiction) that p is not a leaf. Then, deg p ≠ 
1. The path w already contains one edge that contains p (namely, the first edge 
of w). Since deg p ≠ 1, there must be another edge f of T that contains p. 
Consider this f. Let p′ be its endpoint distinct from p (if f is a loop, then we 
set p′ = p). Appending this edge f (and its endpoint) to the beginning of the 
path w, we obtain a backtrack-free walk 


$$\Big( p', f, \underbrace{p, \ldots, q}_{\text{This is } w} \Big)$$


(this is backtrack-free since f is not the first edge of w). According to Proposi- 
tion this backtrack-free walk either is a path or contains a cycle. Since T 
has no cycle (because T is a forest), we thus conclude that this backtrack-free 
walk is a path. It is furthermore a path that contains v and is longer than w 
(longer by 1, in fact). But this contradicts the fact that w is a longest path that 
contains v. This contradiction shows that our assumption (that p is not a leaf) 
was wrong. 

Hence, p is a leaf. A similar argument shows that q is a leaf (here, we need 
to append the new edge at the end of w rather than at the beginning). Thus, 
p and q are two distinct leaves of T (distinct because p ≠ q) such that v lies on 
the path from p to q (since v lies on the path w, which is a path from p to q). 
This proves Theorem 5.3.2 (b). 


(a) Pick any vertex v of T. Then, Theorem 5.3.2 (b) shows that there exist two 
distinct leaves p and q of T such that v lies on the path from p to q. Thus, in 
particular, there exist two distinct leaves p and q of T. In other words, T has at 
least two leaves. This proves Theorem 5.3.2 (a). 
[Remark: Another way to prove part (a) is to write the tree T as T = (V, E, φ), 
and recall the handshake lemma, which yields 


$$\sum_{v \in V} \deg v = 2 \cdot |E| = 2 \cdot (|V| - 1) = 2 \cdot |V| - 2$$ 
(since |E| = |V| − 1 in a tree). 


Since each v ∈ V satisfies deg v ≥ 1 (why?), this equality entails that at least 
two vertices v ∈ V must satisfy deg v ≤ 1 (since otherwise, the sum $\sum_{v \in V} \deg v$ 
would be ≥ 2 · |V| − 1), and therefore these two vertices are leaves.] □


Leaves are particularly helpful for performing induction on trees. The formal 
reason for this is the following theorem: 


Theorem 5.3.3 (induction principle for trees). Let T be a tree with at least 2 
vertices. Let v be a leaf of T. Let T \ v be the multigraph obtained from T 
by removing v and all edges that contain v (note that there is only one such 
edge, since v is a leaf). Then, T \ v is again a tree. 


Here is an example of a tree T and of the smaller tree T \ v obtained by 
removing a leaf v (namely, v = 3): 


Proof of Theorem 5.3.3. Write T as T = (V, E, φ). Thus, T \ v is the induced 
subgraph T[V \ {v}]. 

The graph T is a tree, thus a forest; hence, it has no cycles. Thus, the graph 
T \v has no cycles either. Hence, it is a forest. 

Furthermore, this forest T \ v has at least 1 vertex (since T has at least 2 
vertices). 

We shall now show that any two vertices p and q of T \ v are path-connected 
in T \ v. 
Indeed, let p and q be two vertices of T \ v. Then, p and q are path-connected 
in T (since T is connected). Hence, there exists a path w from p to q in T. 
Consider this path w. Note that v is neither the starting point nor the ending 
point of this path w (since p and q are vertices of T \ v, and thus distinct from 
v). Hence, if v was a vertex of w, then w would contain two distinct edges that 
contain v (namely, the edge just before v and the edge just after v). But this is 
impossible, since there is only one edge available that contains v (because v is 
a leaf). Thus, v cannot be a vertex of w. Hence, the path w does not use the 
vertex v, and thus is a path in the graph T \ v as well. So the vertices p and q 
are path-connected in T \ v. 

We have now shown that any two vertices p and q of T \ v are path-connected 
in T \ v. This shows that T \ v is connected (since T \ v has at least 1 vertex). 
Hence, T \ v is a tree (since T \ v is a forest). □


Theorem 5.3.3 has a converse as well: 


Theorem 5.3.4. Let G be a multigraph. Let v be a vertex of G such that 
deg v = 1 and such that G \v is a tree. (Here, G \ v means the multigraph 
obtained from G by removing the vertex v and all edges that contain v.) 
Then, G is a tree. 


Proof. Left to the reader. (The main step is to show that a cycle of G cannot 
contain v.) □


Theorem 5.3.3 helps prove many properties of trees by induction on the number 
of vertices. In the induction step, remove a leaf v and apply the induction 
hypothesis to T \ v. 
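
The same leaf-removal idea can be run as a procedure. By Theorem 5.3.3 and Theorem 5.3.4, removing a vertex of degree 1 neither creates nor destroys the property of being a tree, and by Theorem 5.3.2 (a) a tree with at least 2 vertices always has a leaf to remove. The Python sketch below is only an illustration of this; the function name `is_tree_by_leaf_removal` and the representation (vertex list plus dict from edge names to endpoint pairs) are assumptions of ours.

```python
def is_tree_by_leaf_removal(vertices, edges):
    """Strip degree-1 vertices one at a time; G is a tree if and only if
    this ends with a single vertex and no edges."""
    vertices = set(vertices)
    edges = dict(edges)
    while len(vertices) > 1:
        deg = {v: 0 for v in vertices}
        for u, v in edges.values():
            deg[u] += 1
            deg[v] += 1
        leaf = next((v for v in vertices if deg[v] == 1), None)
        if leaf is None:
            # A tree with at least 2 vertices would have a leaf.
            return False
        vertices.remove(leaf)
        # Drop the unique edge that contains the removed leaf.
        edges = {k: e for k, e in edges.items() if leaf not in e}
    return len(vertices) == 1 and not edges
```

The degrees are recomputed from scratch in every round, which keeps the code short at the price of quadratic running time.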

The following exercise is essentially a generalization of Theorem 5.3.2 (a): 


Exercise 5.2. Let T be a tree. Let w be any vertex of T. Prove that T has at 
least deg w many leaves. 

Exercise 5.3. A dominating set of a multigraph G is defined to be a dominating 
set of its underlying simple graph G^simp. 
Let G be a forest. Prove that 


$$\sum_{D \text{ is a dominating set of } G} (-1)^{|D|} = \pm 1.$$ 


Exercise 5.4. Let T be a tree having more than 1 vertex. Let L be the set of 
leaves of T. Prove that it is possible to add |L| − 1 new edges to T in such a 
way that the resulting multigraph has a Hamiltonian cycle. 

[Solution: This is Exercise 4 on homework set #3 from my Spring 2017 course; see the course 
page for solutions.] 
5.4. Spanning trees 
5.4.1. Spanning subgraphs 


We now proceed to a crucial application of trees. First we define a concept that 
makes sense for any multigraphs: 


Definition 5.4.1. A spanning subgraph of a multigraph G = (V, E, φ) means 
a multigraph of the form (V, F, φ|_F), where F is a subset of E. 

In other words, it means a submultigraph of G with the same vertex set as 
G. 

In other words, it means a multigraph obtained from G by removing some 
edges, but leaving all vertices undisturbed. 


Compare this to the notion of an induced subgraph: 


• To build an induced subgraph, we throw away some vertices but keep all 
the edges that we can keep. (As usual in mathematics, the words “some 
vertices” include “no vertices” and “all vertices”.) 


• In contrast, to build a spanning subgraph, we keep all vertices but throw 
away some edges. 
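
The contrast between the two constructions can be made concrete with a short Python sketch. The representation (a vertex list together with a dict from edge names to endpoint pairs) and the function names are our own assumptions, chosen purely for illustration.

```python
def induced_subgraph(vertices, edges, keep_vertices):
    """Throw away some vertices, but keep every edge whose endpoints survive."""
    keep = set(keep_vertices)
    return (
        [v for v in vertices if v in keep],
        {k: (u, v) for k, (u, v) in edges.items() if u in keep and v in keep},
    )


def spanning_subgraph(vertices, edges, keep_edges):
    """Keep all vertices, but throw away the edges whose names are not kept."""
    keep = set(keep_edges)
    return list(vertices), {k: e for k, e in edges.items() if k in keep}
```

In the first function the caller chooses a vertex subset and the edge set is forced; in the second the caller chooses an edge subset F and the vertex set is forced, exactly as in (V, F, φ|_F).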


5.4.2. Spanning trees 


Spanning subgraphs are particularly useful when they are trees: 


Definition 5.4.2. A spanning tree of a multigraph G means a spanning sub- 
graph of G that is a tree. 


Example 5.4.3. Let G be the following multigraph: 
Here is a spanning tree of G: 


Here is another: 


(Yes, this is a different one, because α ≠ β.) And here is yet another spanning 
tree of G: 


Example 5.4.4. Let n be a positive integer. Consider the cycle graph C_n. (We 
defined this graph C_n in Definition 2.6.3 for all n ≥ 2, but we later redefined 
C_2 and defined C_1 in Definition 3.3.5. Here, we are using the latter modified 
definition.) 

The graph C_n has exactly n spanning trees. Indeed, any graph obtained 
from C_n by removing a single edge is a spanning tree of C_n. 


Proof. A tree with n vertices must have exactly n − 1 edges (by the implication 
T1 ⟹ T4 in Theorem 5.2.4). Thus, a spanning subgraph of C_n can be a tree only 
if it has n − 1 edges, i.e., only if it is obtained from C_n by removing a single edge 
(since C_n has n edges in total). Thus, C_n has at most n spanning trees (since 
C_n has n edges that can be removed). It remains to check that any subgraph 
obtained from C_n by removing a single edge is indeed a spanning tree. But 
this is easy, since all such subgraphs are isomorphic to the path graph P_n. This 
proves Example 5.4.4. □
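
The count in Example 5.4.4 can be double-checked by brute force. The Python sketch below enumerates all edge subsets of size |V| − 1 and keeps the connected ones; by the implication T4 ⟹ T1, these are precisely the spanning trees. Everything here is an illustrative assumption of ours: the function names, the dict-based representation, and in particular the encoding of C_n as n vertices 1, ..., n with one edge from i to i + 1 (taken cyclically), so that C_1 is a single vertex with a loop and C_2 is a pair of parallel edges.

```python
from itertools import combinations


def is_connected(vertices, edge_list):
    """Depth-first search from the first vertex; an empty graph is rejected."""
    if not vertices:
        return False
    adj = {v: [] for v in vertices}
    for u, v in edge_list:
        adj[u].append(v)
        adj[v].append(u)
    seen, stack = {vertices[0]}, [vertices[0]]
    while stack:
        for w in adj[stack.pop()]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return len(seen) == len(vertices)


def count_spanning_trees(vertices, edges):
    """Count the subsets F of E with |F| = |V| - 1 such that (V, F) is connected.
    Assumes at least one vertex."""
    return sum(
        is_connected(vertices, [edges[k] for k in F])
        for F in combinations(list(edges), len(vertices) - 1)
    )


def cycle(n):
    """Our encoding of C_n: edge e_i joins i to i + 1, and e_n joins n to 1."""
    vertices = list(range(1, n + 1))
    return vertices, {f"e{i}": (i, i % n + 1) for i in vertices}
```

This enumeration is exponential in the number of edges, so it is only suitable for sanity checks on very small graphs.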


Exercise 5.5. Fix m ≥ 1. Let G be the simple graph with 3m + 2 vertices 

a, b, x_1, x_2, ..., x_m, y_1, y_2, ..., y_m, z_1, z_2, ..., z_m 

and the following 3m + 3 edges: 


a x_1, a y_1, a z_1, 
x_i x_{i+1}, y_i y_{i+1}, z_i z_{i+1} for all i ∈ {1, 2, ..., m − 1}, 
x_m b, y_m b, z_m b. 


(Thus, the graph consists of two vertices a and b connected by three paths, 
each of length m + 1, with no overlaps between the paths except for their 
starting and ending points. Here is a picture for m = 3: 

[Figure: the graph G for m = 3] 

) Compute the number of spanning trees of G. 
[To argue why your number is correct, a sketch of the argument in 1-2 
sentences should be enough; a fully rigorous proof is not required.] 


[Solution: This is Exercise 2 (c) on homework set #3 from my Spring 2017 
course; see the course page for solutions.] 


5.4.3. Spanning forests 


A spanning tree of a graph G can be regarded as a minimum “backbone” of G 
— that is, a way to keep G connected using as few edges as possible. Of course, 
if G is not connected, then this is not possible at all, so G has no spanning trees 
in this case. The best one can hope for is a spanning subgraph that keeps each 
component of G connected using as few edges as possible. This is known as a 
“spanning forest”: 
Definition 5.4.5. A spanning forest of a multigraph G means a spanning 
subgraph H of G that is a forest and satisfies conn H = conn G. 


When G is a connected multigraph, a spanning forest of G means the same 
as a spanning tree of G. 


5.4.4. Existence and construction of a spanning tree 


The following theorem is crucial, which is why we will outline four different 
proofs: 


Theorem 5.4.6. Each connected multigraph G has at least one spanning tree. 


First proof. Let G be a connected multigraph. We want to construct a spanning 
tree of G. We try to achieve this by removing edges from G one by one, until 
G becomes a tree. When doing so, we must be careful not to disconnect the 
graph (i.e., not to destroy its connectedness). According to Theorem 3.3.18, this 
can be achieved by making sure that we never remove a bridge (i.e., an edge 
that appears in no cycle). Thus, we keep removing non-bridges (i.e., edges that 
are not bridges) as long as we can (i.e., until we end up with a graph in which 
every edge is a bridge). 

So here is the algorithm: We start with G, and we successively remove non- 
bridges one by one until we no longer have any non-bridges left.³² This procedure 
cannot go on forever, since G has only finitely many edges. Thus, after 
finitely many steps, we will end up with a graph that has no non-bridges any 
more. This resulting graph therefore has no cycles (since any cycle would have 
at least one edge, and this edge would be a non-bridge), but is still connected 
(since G was connected, and we never lost connectedness as we removed only 
non-bridges). Thus, this resulting graph is a tree. Since it is also a spanning 
subgraph of G (by construction), it is therefore a spanning tree of G. This proves 
Theorem 5.4.6. □
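
The procedure from the first proof can be sketched in Python as follows. This is a naive illustration under our own assumptions (dict-based multigraph representation, hypothetical function names), not an efficient algorithm. It relies on the fact that in a connected graph, an edge is a non-bridge exactly when its removal leaves the graph connected. A single pass over the edges suffices: an edge that is a bridge when it is inspected remains a bridge after further edges are removed.

```python
def is_connected(vertices, edges):
    """Depth-first search from the first vertex; an empty graph is rejected."""
    vertices = list(vertices)
    if not vertices:
        return False
    adj = {v: [] for v in vertices}
    for u, v in edges.values():
        adj[u].append(v)
        adj[v].append(u)
    seen, stack = {vertices[0]}, [vertices[0]]
    while stack:
        for w in adj[stack.pop()]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return len(seen) == len(vertices)


def spanning_tree_by_deletion(vertices, edges):
    """Remove non-bridges one by one; return the edges of a spanning tree."""
    if not is_connected(vertices, edges):
        raise ValueError("G must be connected")
    kept = dict(edges)
    for name in list(edges):
        trial = {k: e for k, e in kept.items() if k != name}
        if is_connected(vertices, trial):
            # `name` is a non-bridge of the current graph: safe to remove.
            kept = trial
    return kept
```

Note that each edge is tested against the current graph, not against the original G, in line with the warning that non-bridges must be removed one at a time.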


Second proof (sketched). In the above first proof, we constructed a spanning tree 
of G by starting with G and successively removing edges until we got a tree. 
Now let us take the opposite strategy: Start with an empty graph on the same 
vertex set as G, and successively add edges (from G) until we get a connected 
graph. 

Here are some details: We start with a graph L that has the same vertex set 
as G, but has no edges. Now, we inspect all edges e of G one by one (in some 
order). For each such edge e, we add it to L, but only if it does not create 
a cycle in L; otherwise, we discard this edge. Notice that adding an edge e 


³² Warning: We cannot remove several non-bridges at once! We have to remove them one by 
one. Indeed, if e and f are two non-bridges of G, then there is no guarantee that f remains a 
non-bridge in G \ e. So we cannot remove both e and f simultaneously; we have to remove 
one of them and check whether the other is still a non-bridge. 
with endpoints u and v to L creates a cycle if and only if u and v lie in the 
same component of L (before we add e). Thus, we only add an edge to L if 
its endpoints lie in different components of L; otherwise, we discard it. This 
way, at the end of the procedure, our graph L will still have no cycles (since we 
never create any cycles). In other words, it will be a forest. 

Let me denote this forest by H. (Thus, H is the L at the end of the procedure.) 
I claim that this forest H is a spanning tree of G. Why? Since we know that H is 
a forest, we only need to show that H is connected. Assume the contrary. Thus, 
there is at least one edge e of G whose endpoints lie in different components of 
H (why?). This edge e is therefore not an edge of H. Therefore, at some point 
during our construction of H, we must have discarded this edge e (instead of 
adding it to L). As we know, this means that the endpoints of e used to lie 
in the same component of L at the point at which we discarded e. But this 
entails that these two endpoints lie in the same component of L at the end of 
the procedure as well (because the graph L never loses any edges during the 
procedure, so that any two vertices that used to lie in the same component of 
L at some point will still lie in the same component of L ever after). In other 
words, the endpoints of e lie in the same component of H. This contradicts 
our assumption that the endpoints of e lie in different components of H. This 
contradiction completes our proof that H is connected. Hence, H is a spanning 
tree of G, and we have proved Theorem 5.4.6 again. □
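
The second proof translates almost literally into code. In the Python sketch below, the test "do u and v lie in the same component of L?" is answered with a union-find structure; this data structure, the function name and the dict-based representation are our own assumptions for illustration. Edges are inspected in the order in which the dict lists them.

```python
def spanning_forest_by_addition(vertices, edges):
    """Inspect the edges one by one; add an edge to L only if its endpoints
    currently lie in different components of L."""
    parent = {v: v for v in vertices}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    chosen = {}
    for name, (u, v) in edges.items():
        root_u, root_v = find(u), find(v)
        if root_u != root_v:
            parent[root_u] = root_v  # the two components merge
            chosen[name] = (u, v)
        # Otherwise the edge would create a cycle (loops included): discard it.
    return chosen
```

If G is connected, the chosen edges form a spanning tree of G. If G is not connected, the same procedure still terminates and yields a spanning forest in the sense of Definition 5.4.5, with |V| − conn G edges.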


Third proof. This proof takes yet another approach to constructing a spanning 
tree of G: We choose an arbitrary vertex r of G, and then progressively “spread 
a rumor” from r. The rumor starts at vertex r. On day 0, only r has heard 
the rumor. Every day, every vertex that knows the rumor spreads it to all its 
neighbors (i.e., all vertices adjacent to it). Since G is connected, the rumor 
will eventually spread to every vertex of G. Now, each vertex v (other than r) 
remembers which other vertex v′ it has first heard the rumor from (if it heard 
it from several vertices at the same time, it just picks one of them), and picks 
some edge e_v that has endpoints v and v′ (such an edge must exist, since v must 
have heard the rumor from a neighbor). The edges e_v for all v ∈ V \ {r} (where 
V is the vertex set of G) then form a spanning tree of G (that is, the graph with 
vertex set V and edge set {e_v | v ∈ V \ {r}} is a spanning tree). Why? 

Intuitively, this is quite convincing: This graph cannot have cycles (because 
that would require a time loop) and must be connected (because for any vertex 
v, we can trace back the path of the rumor from r to v by following the 
edges e_v backwards). To obtain a rigorous proof, we formalize this construction 
mathematically: 

Write G as G = (V, E, φ). Choose any vertex r of G. 

We shall recursively construct a sequence of subgraphs 


(V_0, E_0, φ_0), (V_1, E_1, φ_1), (V_2, E_2, φ_2), ... 


of G. The idea behind these subgraphs is that for each i ∈ ℕ, the set V_i will 
consist of all vertices v that have heard the rumor by day i, and the set E_i will 
consist of the corresponding edges e_v. The map φ_i will be the restriction of φ 
to E_i, of course. 
Here is the exact construction of this sequence of subgraphs: 


• Recursion base: Set V_0 := {r} and E_0 := ∅. Let φ_0 be the restriction of φ to 
the (empty) set E_0. 


• Recursion step: Let i ∈ ℕ. Assume that the subgraph (V_i, E_i, φ_i) of G has 
already been defined. Now, we set 


V_{i+1} := V_i ∪ {v ∈ V | v is adjacent to some vertex in V_i}. 


For each v ∈ V_{i+1} \ V_i, we choose one edge e_v that joins³³ v to a vertex in 
V_i (such an edge exists, since v ∈ V_{i+1}; if there are several, we just choose 
a random one). Set 


E_{i+1} := E_i ∪ {e_v | v ∈ V_{i+1} \ V_i}. 


Finally, we let φ_{i+1} be the restriction of the map φ to the set E_{i+1}. This is 
a map from E_{i+1} to P_{1,2}(V_{i+1}) (because any edge e_v with v ∈ V_{i+1} \ V_i has 
one endpoint v in V_{i+1} \ V_i ⊆ V_{i+1} and the other endpoint in V_i ⊆ V_{i+1}). 
Thus, (V_{i+1}, E_{i+1}, φ_{i+1}) is a well-defined subgraph of G. 


This construction yields that (V_i, E_i, φ_i) is a subgraph of (V_{i+1}, E_{i+1}, φ_{i+1}) for 
each i ∈ ℕ. Hence, V_0 ⊆ V_1 ⊆ V_2 ⊆ ···, so that |V_0| ≤ |V_1| ≤ |V_2| ≤ ···. Since 
a sequence of integers bounded from above cannot keep increasing forever (and 
the sizes |V_i| are bounded from above by |V|, since each V_i is a subset of V), we 
thus see that there exists some i ∈ ℕ such that |V_i| = |V_{i+1}|. Consider this i. 
From |V_i| = |V_{i+1}|, we obtain V_i = V_{i+1} (since V_i ⊆ V_{i+1}). 

In our colloquial model above, V_i = V_{i+1} means that no new vertices learn 
the rumor on day i + 1; it is reasonable to expect that at this point, every vertex 
has heard the rumor. In other words, we claim that V_i = V. A rigorous proof 
of this can be easily given using the fact that G is connected.³⁴ 

Now, we claim that the subgraph (V_i, E_i, φ_i) is a spanning tree of G. To see 
this, we must show that this subgraph is a forest and is connected (since V_i = V 
already shows that it is a spanning subgraph). Before we do this, let us give an 
example: 


³³ We say that an edge joins a vertex p to a vertex q if the endpoints of this edge are p and q. 

³⁴ Here is the proof in detail: We must show that V_i = V. Assume the contrary. Thus, there 
exists a vertex u ∈ V \ V_i. Consider this u. Since G is connected, there is a path from r to u. 
This path starts at a vertex in V_i 
(since r ∈ V_0 ⊆ V_i) and ends at a vertex in V \ V_i (since u ∈ V \ V_i). Thus, it must cross over 
from V_i into V \ V_i at some point. Therefore, there exists an edge with one endpoint in V_i 
and the other endpoint in V \ V_i. Let v and w be these two endpoints, so that v ∈ V_i and 
w ∈ V \ V_i. Then, w is adjacent to some vertex in V_i (namely, to v), and therefore belongs to 
V_{i+1} (by the definition of V_{i+1}). Hence, w ∈ V_{i+1} = V_i. But this contradicts w ∈ V \ V_i. This 
contradiction shows that our assumption was wrong, qed. 
Example 5.4.7. Let G be the following multigraph: 


Set r = 3. Then, the above construction yields 


V_0 = {3}, 

V_1 = {3, 1, 4}, 

V_2 = {3, 1, 4, 2, 5, 6, 10}, 

V_3 = {3, 1, 4, 2, 5, 6, 10, 8, 9, 7} = V, 


so that V_k = V for all k ≥ 3. Thus, we can take i = 3. Here is an image of the 
V_k as progressively growing circles: 


(The dark-red inner circle is V_0; the red circle is V_1; the orange circle is V_2; 
the yellow circle is V_3 = V_4 = V_5 = ··· = V.) Finally, the edges e_v can be 
chosen to be the following (we are painting them red for clarity): 


(Here, we have made two choices: We chose e_2 to be the edge joining 2 with 1 
rather than the edge joining 2 with 4, and we chose e_7 to be the edge joining 
7 with 6 rather than 7 with 5. The other options would have been equally 
fine.) 


We now return to the general proof. Let us first show the following: 


Claim 1: Let j ∈ ℕ. Each vertex of the graph (V_j, E_j, φ_j) is path-connected 
to r in this graph. 


[Proof of Claim 1: We induct on j: 

Base case: For j = 0, Claim 1 is obvious, since V_0 = {r} (so the only vertex of 
the graph in question is r itself). 

Induction step: Fix some positive integer k. Assume (as the induction hypothesis) 
that Claim 1 holds for j = k − 1. That is, each vertex of the graph 
(V_{k−1}, E_{k−1}, φ_{k−1}) is path-connected to r in this graph. 

Now, let v be a vertex of the graph (V_k, E_k, φ_k). We must show that v is 
path-connected to r in this graph. If v ∈ V_{k−1}, then this follows from the induction 
hypothesis (since (V_{k−1}, E_{k−1}, φ_{k−1}) is a subgraph of (V_k, E_k, φ_k)). Thus, 
we WLOG assume that v ∉ V_{k−1} from now on. Hence, v ∈ V_k \ V_{k−1}. According 
to the recursive definition of E_k, this entails that there is an edge e_v ∈ E_k 
that joins v to some vertex u ∈ V_{k−1}. Consider this latter vertex u. Then, v 
is path-connected to u in the graph (V_k, E_k, φ_k) (since the edge e_v provides a 
length-1 path from v to u). However, u is path-connected to r in the graph 
(V_{k−1}, E_{k−1}, φ_{k−1}) (by the induction hypothesis, since u ∈ V_{k−1}), hence also 
in the graph (V_k, E_k, φ_k) (since (V_{k−1}, E_{k−1}, φ_{k−1}) is a subgraph of (V_k, E_k, φ_k)). 
Since the relation “path-connected” is transitive, we conclude from the previous 
two sentences that v is path-connected to r in the graph (V_k, E_k, φ_k). 

So we have shown that each vertex v of the graph (V_k, E_k, φ_k) is path-connected 
to r in the graph (V_k, E_k, φ_k). In other words, Claim 1 holds for j = k. This completes 
the induction step, and Claim 1 is proved.] 


Claim 1 (applied to j = i) shows that each vertex of the graph (V_i, E_i, φ_i) is 
path-connected to r in this graph. Since the relation “path-connected” is an 
equivalence relation, this entails that any two vertices of this graph are path-connected. 
Thus, the graph (V_i, E_i, φ_i) is connected (since it has at least one 
vertex). It remains to prove that this graph (V_i, E_i, φ_i) is a forest. 

Again, we do this using an auxiliary claim: 


Claim 2: Let j ∈ ℕ. Then, the graph (V_j, E_j, φ_j) has no cycles. 


[Proof of Claim 2: We induct on j: 

Base case: The graph (V_0, E_0, φ_0) has no edges (because E_0 = ∅) and thus no 
cycles. Thus, Claim 2 holds for j = 0. 

Induction step: Fix some positive integer k. Assume (as the induction hypothesis) 
that Claim 2 holds for j = k − 1. That is, the graph (V_{k−1}, E_{k−1}, φ_{k−1}) has 
no cycles. 

Now, let C be a cycle of the graph (V_k, E_k, φ_k). Then, C must use at least 
one edge from E_k \ E_{k−1} (since otherwise, C would be a cycle of the graph 
(V_{k−1}, E_{k−1}, φ_{k−1}), but this is impossible, since (V_{k−1}, E_{k−1}, φ_{k−1}) has no cycles). 
However, each edge from E_k \ E_{k−1} has the form e_v for some v ∈ V_k \ V_{k−1} 
(because of how E_k was defined). Thus, C must have an edge of this form. 
Consider the corresponding vertex v ∈ V_k \ V_{k−1}. The cycle C contains the edge 
e_v and therefore also contains its endpoint v. However, (again by the definition 
of E_k) the edge e_v is the only edge in E_k that contains the vertex v. Thus, the 
vertex v cannot be contained in any cycle of (V_k, E_k, φ_k) (because a cycle would 
necessarily include two distinct edges that contain v). This contradicts the fact 
that the cycle C contains v. 

Forget that we fixed C. We thus have obtained a contradiction for each cycle 
C of the graph (V_k, E_k, φ_k). Hence, the graph (V_k, E_k, φ_k) has no cycles. In other 
words, Claim 2 holds for j = k. This completes the induction step, and Claim 2 
is proved.] 

Applying Claim 2 to j = i, we see that the graph (V_i, E_i, φ_i) has no cycles. In
other words, this graph is a forest. Since it is connected, it is therefore a tree. 
Since it is a spanning subgraph of G, we thus conclude that it is a spanning tree 
of G. Hence, we have constructed a spanning tree of G. 


We note an important property of this construction: 
Claim 3: For each k ∈ N, we have
V_k = {v ∈ V | d(r, v) ≤ k},
where d(r, v) means the length of a shortest path from r to v.


This is easily proved by induction on k. Thus, the spanning tree (V_i, E_i, φ_i)
we have constructed has the following property: For each v ∈ V, the path from
r to v in this spanning tree is a shortest path from r to v in G. For this reason,
this spanning tree is called a breadth-first search (“BFS”) tree. Note that the
choice of root r is important here: It is usually not true that the path from an
arbitrary vertex u to an arbitrary vertex v along our spanning tree is a shortest
path in G. No spanning tree of G has this property, unless G itself is “more or
less a tree” (more precisely, unless G^simp is a tree)! □
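
To make the construction in the above proof concrete, here is a minimal Python sketch of it. It is an illustration only and not part of the proof. The encoding is an assumption made for this sketch: the multigraph is given as a list of pairs (u, v), the edge with index i being edges[i], and the names bfs_tree, dist and parent_edge are made up. For each vertex v ≠ r, the entry parent_edge[v] plays the role of the edge e_v, and dist[v] equals d(r, v), in line with Claim 3.

```python
from collections import deque


def bfs_tree(vertices, edges, r):
    """Breadth-first search from r in a connected multigraph.

    vertices: iterable of vertices; edges: list of pairs (u, v), so that
    parallel edges and loops are allowed; r: the root.
    Returns (dist, parent_edge), where dist[v] = d(r, v) and parent_edge[v]
    is the index of an edge joining v to a vertex at distance d(r, v) - 1.
    """
    adj = {v: [] for v in vertices}
    for i, (u, v) in enumerate(edges):
        adj[u].append((v, i))
        adj[v].append((u, i))
    dist = {r: 0}
    parent_edge = {}
    queue = deque([r])
    while queue:
        u = queue.popleft()
        for w, i in adj[u]:
            if w not in dist:              # w is reached for the first time
                dist[w] = dist[u] + 1
                parent_edge[w] = i
                queue.append(w)
    return dist, parent_edge


# A 4-cycle 1-2-3-4-1 with a doubled edge between 1 and 2.
edges = [(1, 2), (1, 2), (2, 3), (3, 4), (4, 1)]
dist, parent_edge = bfs_tree([1, 2, 3, 4], edges, 1)
tree_edges = sorted(parent_edge.values())  # indices of the spanning tree edges
```

In this small example, the three edges with indices in tree_edges form a BFS tree rooted at 1, and dist records the distances from the root.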


Fourth proof of Theorem 5.4.6 (sketched). We imagine a snake that slithers along
the edges of G, trying to eventually bite each vertex. It starts at some vertex r, 
which it immediately bites. Any time the snake enters a vertex v, it makes the 
following step: 


• If some neighbor of v has not been bitten yet, then the snake picks such
a neighbor w as well as some edge f that joins w with v; the snake then 
moves to w along the edge f, bites the vertex w and marks the edge f. 


• If not, then the snake marks the vertex v as fully digested and backtracks
(along the marked edges) to the last vertex it has visited but not fully 
digested yet. 


Once backtracking is no longer possible (because there are no more vertices 
left that are not fully digested), the procedure is finished. I claim that the 
marked edges at that moment are the edges of a spanning tree of G. 

I won't prove this claim in detail, but I will give some hints. First, however, 
an example: 
Example 5.4.8. Let G be the following connected multigraph:


Let our snake start its journey at r = 3. It bites this vertex. Then, let’s say 
that it picks the vertex 1 as its next victim (it could just as well go to 4 or 7; 
the snake has many choices, but we follow one possible trip). Thus, it next 
arrives at vertex 1, bites it and marks the edge that brought it to this vertex. 
As its next destination, it necessarily picks the vertex 2 (since vertex 3 has 
already been bitten). It moves to vertex 2, bites it and marks the edge. Next, 
let’s say that it picks the vertex 4 (the other option would be 8). It thus moves 
to 4, bites it and marks the edge. Proceeding likewise, it then moves to 5 (the 
other options are 6 and 10; the vertices 2 and 3 do not qualify since they are 
already bitten), bites 5 and marks an edge. From there, let’s say it moves to 
8, bites 8 and marks an edge. Now, there is no longer an unbitten neighbor 
of 8 to move to. Thus, the snake marks the vertex 8 as fully digested and 
backtracks to the last vertex not fully digested — which, at this point, is 5. 
From this vertex 5, it moves on to 9 (this is the only option, since 4 and 8 
have already been bitten). And so on. Here is one possible outcome of this 
journey (there are a few more decisions that the snake can make here, so you 
may get a different one):


Here, the marked edges are drawn in bold red ink, and endowed with an 
arrow that represents the direction in which they were first used (e.g., the 
edge joining 2 with 4 has an arrow towards 4 because it was first used to get 
from 2 to 4). 


Now, as promised, let me outline a proof of the above claim (that the marked 
edges form a spanning tree of G). To wit, argue the following four observations 
(ideally in this order): 


1. After each step, the marked edges are precisely the edges along which the 
snake has moved so far. 


2. After each step, the network of bitten vertices and marked edges is a tree. 
3. After enough steps, each bitten vertex is fully digested. 


4. At that point, the network of bitten vertices and marked edges is a span- 
ning tree (since each neighbor of a fully digested vertex is bitten, thus 
fully digested by observation 3). 


Details are left to the reader. 

The result is that Theorem 5.4.6 is proved once again. However, more comes
out of the above construction if you know where to look. The spanning tree 
T of G whose edges are the edges marked by the snake is called a depth-first 
search (“DFS”) tree. It has the following extra property: If u and v are two 
adjacent vertices of G, then either u lies on the path from r to v in T, or v lies on
the path from r to u in T. (A spanning tree with this property is called a “lineal
spanning tree”. See [BenWil06, §6.1] for details.) □
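
The snake's procedure can also be simulated on a computer. The following Python sketch is one possible rendering, given here as an illustration under assumptions: the multigraph is encoded as a list of pairs (u, v) with edges[i] being the edge of index i, the snake always picks the first unbitten neighbor in the order in which the edges are listed (the procedure itself allows any choice), and the names dfs_tree, bitten, marked and trail are made up. The list trail holds the vertices that are bitten but not yet fully digested, in the order in which the snake would backtrack through them.

```python
def dfs_tree(vertices, edges, r):
    """Simulate the snake starting at r; return the indices of the marked edges."""
    adj = {v: [] for v in vertices}
    for i, (u, v) in enumerate(edges):
        adj[u].append((v, i))
        adj[v].append((u, i))
    bitten = {r}
    marked = []
    trail = [r]                  # bitten vertices that are not fully digested
    while trail:
        v = trail[-1]            # the vertex where the snake currently sits
        step = next(((w, i) for w, i in adj[v] if w not in bitten), None)
        if step is None:
            trail.pop()          # v is fully digested; backtrack
        else:
            w, i = step
            bitten.add(w)        # move to w along edge i, bite w, mark the edge
            marked.append(i)
            trail.append(w)
    return marked


# A triangle 1-2-3 with a pendant vertex 4 attached to 3.
edges = [(1, 2), (2, 3), (3, 1), (3, 4)]
marked = dfs_tree([1, 2, 3, 4], edges, 1)
```

In this run, the snake visits 1, 2, 3, 4 in this order, so the marked edges are those with indices 0, 1 and 3. The sketch rescans the neighbors of a vertex each time the snake returns to it, which is wasteful but keeps the code close to the description above.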


5.4.5. Applications 


Spanning trees have lots of applications: 


• A spanning tree of a graph can be viewed as a kind of “backbone” of
the graph, which in particular provides “canonical” paths between any 
two vertices. This is useful, e.g., for networking applications where hav- 
ing a choice between different paths would be problematic (see, e.g., the 
Spanning Tree Protocol). 


• A w-minimum spanning tree (see Exercise 5.8 = Homework set #5 exercise
6) solves a global version of the cheapest-path problem. It can also be used
for detecting clusters. 


• Depth-first search (the algorithm used in our fourth proof of Theorem 5.4.6)
can also be used as a way to traverse all vertices of a given graph
and return back to the starting point. In particular, this provides an al- 
gorithmic way to solve mazes (since a maze can be modeled as a graph, 
where the vertices correspond to “rooms” and the edges correspond to 
“doors”). This appears to have been the original motivation for Trémaux 
to invent depth-first search back in the 19th century. 


Here is a more theoretical application of spanning trees: 


Definition 5.4.9. A vertex v of a connected multigraph G is said to be a cut- 
vertex if the graph G \ v is disconnected. (Recall that G \ v is the multigraph 
obtained from G by removing the vertex v and all edges that contain v.) 


Proposition 5.4.10. Let G be a connected multigraph with ≥ 2 vertices. Then,
there are at least 2 vertices of G that are not cut-vertices. 


Proof. Pick a spanning tree T of G (we know from Theorem 5.4.6 that such a
spanning tree exists). Then, T has at least 2 leaves (by Theorem (a)). But
each leaf of T is a non-cut-vertex of G (why?). □


Remark 5.4.11. It is not true that conversely, any non-leaf of T is a cut-vertex 
of G. So we cannot get any lower bound on the number of cut-vertices. 
And this is not surprising: Lots of graphs (e.g., the complete graph K_n for
n > 2) have no cut-vertices at all. These graphs are said to be 2-connected, 
and their properties have been amply studied (see, e.g., §4.2] for an 
introduction). 
5.4.6. Exercises


Exercise 5.6. Let G be a connected multigraph. Let T_1 and T_2 be two spanning
trees of G.
Prove the following:^35


(a) For any e ∈ E(T_1) \ E(T_2), there exists an f ∈ E(T_2) \ E(T_1) with the
property that replacing e by f in T_1 (that is, removing the edge e from
T_1 and adding the edge f) results in a spanning tree of G.


(b) For any f ∈ E(T_2) \ E(T_1), there exists an e ∈ E(T_1) \ E(T_2) with the
property that replacing e by f in T_1 (that is, removing the edge e from
T_1 and adding the edge f) results in a spanning tree of G.


[Hint: The two parts look very similar, but (to my knowledge) their proofs 
are not.] 


Exercise 5.7. Let G be a connected multigraph. Let S be the simple graph
whose vertices are the spanning trees of G, and whose edges are defined as
follows: Two spanning trees T_1 and T_2 of G are adjacent (as vertices of S)
if and only if T_2 can be obtained from T_1 by removing an edge and adding
another (i.e., if and only if there exist an edge e_1 of T_1 and an edge e_2 of T_2
such that e_2 ≠ e_1 and T_2 \ e_2 = T_1 \ e_1).

Prove that the simple graph S is itself connected. (In simpler language: 
Prove that any spanning tree of G can be transformed into any other span- 
ning tree of G by a sequence of legal “remove an edge and add another” 
operations, where such an operation is called legal if its result is a spanning 
tree of G.) 


[Example: If G is the multigraph

[figure not reproduced]

then the graph S looks as follows:

[figure not reproduced]
]


^35 Recall that E(H) denotes the edge set of any graph H.


Exercise 5.8. Let G = (V, E, φ) be a connected multigraph. Let w : E → R be
a map that assigns a real number w(e) to each edge e. We shall call this real
number w(e) the weight of the edge e.
If H = (W, F, φ|_F) is a subgraph of G, then the weight w(H) of H is
defined to be ∑_{f ∈ F} w(f) (that is, the sum of the weights of all edges of H).


A w-minimum spanning tree of G means a spanning tree of G that has 
the smallest weight among all spanning trees of G. 

In our first proof of Theorem 5.4.6, we have seen a way to construct a
spanning tree of G by successively removing non-bridges until only bridges 
remain. (A non-bridge means an edge that is not a bridge.) 

Now, let us perform this algorithm, but taking care to choose a non-bridge 
of largest weight (among all non-bridges) at each step. Prove that the result 
will be a w-minimum spanning tree. 
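
The algorithm described in this exercise can be tried out on small examples. The following Python sketch is an illustration and proves nothing. It assumes that edges are given as triples (u, v, weight), and the names is_connected and reverse_delete are made up. It walks through the edges once, from heaviest to lightest, and deletes an edge whenever the graph stays connected without it. Since a bridge remains a bridge when further edges are removed, this single pass always deletes a non-bridge of largest weight among the current non-bridges, as the exercise requires.

```python
def is_connected(vertices, edges):
    """Check whether the multigraph with these vertices and edges is connected."""
    vertices = list(vertices)
    adj = {v: [] for v in vertices}
    for u, v, _ in edges:
        adj[u].append(v)
        adj[v].append(u)
    seen = {vertices[0]}
    stack = [vertices[0]]
    while stack:
        u = stack.pop()
        for w in adj[u]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return len(seen) == len(vertices)


def reverse_delete(vertices, edges):
    """Remove non-bridges, heaviest first; edges are triples (u, v, weight)."""
    remaining = list(edges)
    for e in sorted(edges, key=lambda e: e[2], reverse=True):
        trial = list(remaining)
        trial.remove(e)                  # tentatively delete one copy of e
        if is_connected(vertices, trial):
            remaining = trial            # e was a non-bridge: delete it for good
    return remaining


edges = [(1, 2, 1), (2, 3, 2), (3, 1, 3), (3, 4, 4), (4, 1, 5)]
tree = reverse_delete([1, 2, 3, 4], edges)
total_weight = sum(w for _, _, w in tree)
```

In this example, the edges of weights 5 and 3 are deleted, and the three surviving edges have total weight 7.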
Exercise 5.9. Let G be a connected multigraph with an even number of vertices.
Prove that there exists a spanning subgraph H of G such that each
vertex of H has odd degree (in H). 


[Hint: One way to solve this begins by reducing the problem to the case 
when G is a tree. ] 


5.4.7. Existence and construction of a spanning forest 


So we have learnt that connected graphs have spanning trees. What do discon- 
nected graphs have? 


Corollary 5.4.12. Each multigraph has a spanning forest.


Proof. Apply Theorem 5.4.6 to each component of the multigraph. Then, combine
the resulting spanning trees into a spanning forest. □


5.5. Centers of graphs and trees 


5.5.1. Distances 


Given a graph, we can define a “distance” between any two of its vertices, 
simply by counting edges on the shortest path from one to the other: 


Definition 5.5.1. Let G be a multigraph. 

For any two vertices u and v of G, we define the distance between u and
v to be the smallest length of a path from u to v. If no such path exists, then
this distance is defined to be ∞.

The distance between u and v is denoted by d(u, v) or by d_G(u, v) when
the graph G is not clear from the context.


Example 5.5.2. If G is the multigraph from Example then 


d_G(1, 9) = 4, d_G(4, 13) = 2, d_G(4, 4) = 0.


Remark 5.5.3. Distances in a multigraph satisfy the rules that you would 
expect a distance function to satisfy: 


(a) We have d (u, u) = 0 for any vertex u. 
(b) We have d (u,v) = d (v, u) for any vertices u and v. 


(c) We have d(u, v) + d(v, w) ≥ d(u, w) for any vertices u, v and w. (Here,
we understand that ∞ ≥ m and ∞ + m = ∞ for any m ∈ N.)


Also: 
(d) The distances d(u, v) do not change if we replace “path” by “walk” in
the definition of the distance. 


(e) If V is the vertex set of our multigraph, then d(u, v) ≤ |V| - 1 for any
vertices u and v with d(u, v) ≠ ∞.


Proof. Part (d) follows from Corollary 3.3.10. The proofs of (a), (b) and (c) are
then straightforward (the proof of (c) relies on part (d), because splicing two
paths generally only yields a walk, not a path). Finally, in order to prove part
(e), observe that any path of our multigraph has length ≤ |V| - 1 (since its
vertices are distinct). □


We note that the definition of a distance becomes simpler if our multigraph 
is a tree: Namely, if T is a tree, then the distance d (u,v) between two vertices 
u and v is the length of the only path from u to v in T. Thus, in a tree, we do 
not have to worry whether a given path is the shortest. 

We also notice that if G is a multigraph, and if u and v are two vertices of
G, then the distance d_G(u, v) in G equals the distance d_{G^simp}(u, v) in the simple
graph G^simp. (The reason for this is that any path of G can be converted into
a path of G^simp having the same length, and vice versa. Of course, this is not
a one-to-one correspondence, but it suffices for our purposes.) Thus, when
studying distances on a multigraph, we can WLOG restrict ourselves to simple
graphs.


The following few exercises give some curious properties of distances in var- 
ious kinds of graphs. 


Exercise 5.10. Let a, b and c be three vertices of a connected multigraph
G = (V, E, φ). Prove that d(b, c) + d(c, a) + d(a, b) ≤ 2|V| - 2.


[Solution: This is Exercise 7 on midterm #1 from my Spring 2017 course,
except that the simple graph has been replaced by a multigraph (but this
makes no serious difference); see the course page for solutions.]


Exercise 5.11. Let a, b and c be three vertices of a strongly connected
multidigraph D = (V, A, ψ) such that |V| ≥ 4. For any two vertices u and v of D,
we define the distance d(u, v) to be the smallest length of a path from u to v.
(This definition is the obvious analogue of Definition 5.5.1 for digraphs.)


(a) Prove that d(b, c) + d(c, a) + d(a, b) ≤ 3|V| - 4.


(b) For each n ≥ 5, construct an example in which |V| = n and d(b, c) +
d(c, a) + d(a, b) = 3|V| - 4. (No proof is required for the example.)


[Solution: This is Exercise 5 on homework set #3 from my Spring 2017
course, except that the simple digraph has been replaced by a multidigraph
(but this makes no serious difference); see the course page for solutions.]
Exercise 5.12. Let G be a tree. Let x, y, z and w be four vertices of G.
Show that the two largest ones among the three numbers 


d(x,y)+d(z,w), d(x,z) +d (y, w) and d (x,w) +d (y,z) 


are equal. 


[Solution: This is Exercise 6 on midterm #2 from my Spring 2017 course;
see the course page for solutions.]


Exercise 5.13. Let G be a connected multigraph. Let x, y, z and w be four 
vertices of G. 
Assume that the two largest ones among the three numbers 


d(x,y)+d(z,w), d(x,z)+d(y,w) and d(x,w)+d(y,z) 


are not equal. 
Prove that G has a cycle of length ≤ d(x, z) + d(y, w) + d(x, w) + d(y, z).


[Hint: This is a strengthening of Exercise 5.12. Try deriving it by applying
the latter exercise to a strategically chosen subgraph of G.] 


[Solution: This is Exercise 1 on midterm #3 from my Spring 2017 course;
see the course page for solutions.]


5.5.2. Eccentricity and centers 


We can now define “eccentricities”: 


Definition 5.5.4. Let v be a vertex of a multigraph G = (V, E, φ). The
eccentricity of v (with respect to G) is defined to be the number


max {d(v, u) | u ∈ V} ∈ N ∪ {∞}.


This eccentricity is denoted by ecc v or ecc_G v.


Definition 5.5.5. Let G = (V, E, φ) be a multigraph. Then, a center of G
means a vertex of G whose eccentricity is minimum (among all vertices). 


(Some authors have a slightly different definition of a “center”: They define 
the center of G to be the set of all vertices of G whose eccentricity is minimum. 
That is, what they call “center” is the set of what we call “centers”.) 
Example 5.5.6. Let G be the following multigraph:


Then, the eccentricities of its vertices are as follows (we are just labeling each
vertex with its eccentricity):


Thus, the centers of G are the vertices r and v. 


Example 5.5.7. Let G be a complete graph K_n (with n vertices). Then, each
vertex of G has the same eccentricity (which is 1 if n ≥ 2 and 0 if n = 1), and
thus each vertex of G is a center of G. 


Example 5.5.8. Let G be a graph with more than one component. Then, each 
vertex v of G has eccentricity ∞ (because there exists at least one vertex u
that lies in a different component of G than v, and thus this vertex u satisfies
d(v, u) = ∞). Hence, each vertex of G is a center of G.
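
The definitions of distance, eccentricity and center translate directly into a short computation. Here is a minimal Python sketch, meant only as an illustration. It assumes that the graph is given as a dictionary sending each vertex to the list of its neighbors (this loses parallel edges and loops, which is harmless because distances in G and in G^simp agree), and the names distances_from, eccentricity and centers are made up. Distances are found by breadth-first search, and math.inf stands for ∞.

```python
from collections import deque
import math


def distances_from(adj, s):
    """Return a dictionary v -> d(s, v), with math.inf if no path exists."""
    dist = {v: math.inf for v in adj}
    dist[s] = 0
    queue = deque([s])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if dist[w] == math.inf:      # w has not been reached before
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist


def eccentricity(adj, v):
    """The largest distance from v to any vertex."""
    return max(distances_from(adj, v).values())


def centers(adj):
    """All vertices of minimum eccentricity."""
    ecc = {v: eccentricity(adj, v) for v in adj}
    smallest = min(ecc.values())
    return [v for v in adj if ecc[v] == smallest]


k4 = {v: [w for w in range(4) if w != v] for v in range(4)}   # complete graph
two_components = {1: [2], 2: [1], 3: []}                      # disconnected
path = {1: [2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}      # path on 5 vertices
```

Here, centers(k4) and centers(two_components) return all vertices (as in Example 5.5.7 and Example 5.5.8), whereas centers(path) returns only the middle vertex 3.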


5.5.3. The centers of a tree 


As we see from Example 5.5.8, eccentricity and centers are not very useful
notions when the graph is disconnected. Even for a connected graph, Example
5.5.6 shows that the centers do not necessarily form a connected subgraph.
However, in a tree, they behave a lot better: 


Theorem 5.5.9. Let T be a tree. Then: 


(a) The tree T has either 1 or 2 centers. 
(b) If T has 2 centers, then these 2 centers are adjacent. 


(c) Moreover, these centers can be found by the following algorithm: 


If T has more than 2 vertices, then we remove all leaves from T (simul- 
taneously). What remains is again a tree. If that tree still has more than 
2 vertices, we remove all leaves from it (simultaneously). The result is
again a tree. If that tree still has more than 2 vertices, we remove all 
leaves from it (simultaneously), and continue doing so until we are left 
with a tree that has only 1 or 2 vertices. These vertices are the centers 
of T. 
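
The algorithm in part (c) is easy to run by hand, but it can also be written out in code. The following Python sketch is one possible rendering, under assumptions: the tree is given as a dictionary sending each vertex to the collection of its neighbors, and the name tree_centers is made up. In each round, the list of leaves is computed first and only then are the leaves removed, which matches the word “simultaneously” in the theorem.

```python
def tree_centers(adj):
    """Return the centers of a tree by repeatedly removing all leaves.

    adj: dictionary sending each vertex to an iterable of its neighbors.
    """
    adj = {v: set(nb) for v, nb in adj.items()}   # work on a copy
    while len(adj) > 2:
        leaves = [v for v in adj if len(adj[v]) == 1]
        for v in leaves:
            (w,) = adj[v]        # the unique neighbor of the leaf v
            adj[w].discard(v)    # w is not a leaf, so it is still present
            del adj[v]
    return sorted(adj)


path4 = {1: [2], 2: [1, 3], 3: [2, 4], 4: [3]}
path5 = {1: [2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}
star = {0: [1, 2, 3], 1: [0], 2: [0], 3: [0]}
```

For instance, tree_centers(path4) returns the two adjacent vertices 2 and 3, while tree_centers(path5) and tree_centers(star) return a single center each. The sketch relies on the fact that two leaves of a tree with more than 2 vertices are never adjacent.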


To prove Theorem 5.5.9, we first study how a tree is affected when all its
leaves are removed: 


Lemma 5.5.10. Let T = (V, E, φ) be a tree with more than 2 vertices.

Let L be the set of all leaves of T.

Let T \ L be the induced submultigraph of T on the set V \ L. (Thus, T \ L
is obtained from T by removing all the vertices in L and all edges that
contain a vertex in L.)

Then: 


(a) The multigraph T \ L is a tree. 
(b) For any u ∈ V \ L and v ∈ V \ L, we have
{paths of T from u to v} = {paths of T \ L from u to v}


(that is, the paths of T from u to v are precisely the paths of T \ L from
u to v).


(c) For any u ∈ V \ L and v ∈ V \ L, we have d_T(u, v) = d_{T\L}(u, v).
(d) Each vertex v ∈ V \ L satisfies ecc_T v = ecc_{T\L} v + 1.


(e) Each leaf v ∈ L satisfies ecc_T v = ecc_T w + 1, where w is the unique
neighbor of v in T. (A neighbor of v means a vertex that is adjacent to
v.)


(f) The centers of T are precisely the centers of T \ L.
Example 5.5.11. Let T be the following tree:


Then, the set L from Lemma 5.5.10 is {4, 5, 7, 8, 10, 11}, and the tree T \ L
looks as follows:


Proof of Lemma 5.5.10. First, we notice that T is a forest (since T is a tree), and
thus has no cycles. In particular, T therefore has no loops and no parallel edges.
Also, for any two vertices u and v of T, there is a unique path from u to v in T.

Next, we introduce some terminology: If p is a path of some multigraph, then
an intermediate vertex of p shall mean a vertex of p that is neither the starting
point nor the ending point of p. In other words, if p = (p_0, e_1, p_1, e_2, p_2, ..., e_k, p_k)
is a path of some multigraph, then the intermediate vertices of p are p_1, p_2, ..., p_{k-1}.
Clearly, any intermediate vertex of a path p must have degree ≥ 2 (since the
path p enters it along some edge, and leaves it along another). Hence, if p is a
path of T, then


any intermediate vertex of p must belong to V \ L        (12)


(because it must have degree ≥ 2, thus cannot be a leaf of T; but this means
that it cannot belong to L; therefore, it must belong to V \ L).


(b) Let u ∈ V \ L and v ∈ V \ L. Let p be a path of T from u to v. We shall
show that p is a path of T \ L as well.

Indeed, let us first check that all vertices of p belong to V \ L. This is clear for
the vertices u and v (since u ∈ V \ L and v ∈ V \ L); but it also holds for every
intermediate vertex of p (by (12)). Thus, it does indeed hold for all vertices of
p.

We have thus shown that all vertices of p belong to V \ L. Hence, p is a path
of T \ L (since T \ L is the induced submultigraph of T on the set V \ L).

Forget that we fixed p. We have thus shown that every path p of T from u to 
v is also a path of T \ L. Hence, 


{paths of T from u to v} ⊆ {paths of T \ L from u to v}.
Conversely, we have
{paths of T \ L from u to v} ⊆ {paths of T from u to v},


since every path of T \ L is a path of T (because T \ L is a submultigraph of
T). Combining these two facts, we obtain


{paths of T from u to v} = {paths of T \ L from u to v}.


This proves Lemma 5.5.10 (b).


(c) This follows from Lemma 5.5.10 (b), since the distance d_G(u, v) of two
vertices u and v in a graph G is defined to be the smallest length of a path from
u to v.


(a) The graph T is a tree, thus a forest. Hence, its submultigraph T \ L is a 
forest as well (since any cycle of T \ L would be a cycle of T). It thus remains 
to show that T \ L is connected. 

First, it is easy to see that T \ L has at least one vertex.^36 It remains to show
that any two vertices of T \ L are path-connected.

Let u and v be two vertices of T \ L. Then, u ∈ V \ L and v ∈ V \ L. Hence,
Lemma 5.5.10 (b) yields


{paths of T from u to v} = {paths of T \ L from u to v}.


Thus, {paths of T \ L from u to v} = {paths of T from u to v} ≠ ∅ (since there
exists a path of T from u to v (because T is connected)). In other words, there
exists a path of T \ L from u to v. In other words, u and v are path-connected
in T \ L.

We have now shown that any two vertices u and v of T \ L are path-connected
in T \ L. This entails that T \ L is connected (since T \ L has at least one vertex).
This proves Lemma 5.5.10 (a).


^36 Proof. We assumed that T has more than 2 vertices. In other words, there exist three distinct
vertices u, v, w of T. Consider these u, v, w. If all three distances d_T(u, v), d_T(v, w) and
d_T(w, u) were equal to 1, then T would have a cycle (of the form (u, *, v, *, w, *, u), where
each asterisk stands for some edge); but this would contradict the fact that T has no cycles.
Thus, not all of these three distances are equal to 1. Hence, at least one of them is ≠ 1.
WLOG assume that d_T(u, v) ≠ 1 (otherwise, we permute u, v, w). Hence, the path from
u to v has more than one edge (indeed, it must have at least one edge, since u and v are
distinct). Therefore, this path has at least one intermediate vertex. This intermediate vertex
then must belong to V \ L (by (12)). Hence, it is a vertex of the subgraph T \ L. This shows
that T \ L has at least one vertex.
(d) If u and v are two vertices of T \ L, then the two distances d_T(u, v) and
d_{T\L}(u, v) are equal (by Lemma 5.5.10 (c)); thus, we shall denote both distances
by d(u, v) (since there is no confusion to be afraid of).

Let v ∈ V \ L. We must show that ecc_T v = ecc_{T\L} v + 1.

Let u be a vertex of T \ L such that d(v, u) is maximum. Thus, ecc_{T\L} v =
d(v, u) (by the definition of ecc_{T\L} v). However, u is a vertex of T \ L, and thus
does not belong to L. Hence, u is not a leaf of T (since L is the set of all leaves
of T). Hence, u has degree ≥ 2 in T (since a vertex in a tree with more than 1
vertex cannot have degree 0).

Now, consider the path p from v to u in the tree T. This path p has length
d(v, u). Since u has degree ≥ 2, there exist at least two edges of T that contain
u. Hence, in particular, there exists at least one edge f that contains u and is
distinct from the last edge of p.^37 Consider this edge f. Let w be the endpoint
of f other than u. Appending f and w to the end of the path p, we obtain a
walk from v to w. This walk is backtrack-free (since f is distinct from the last
edge of p) and thus must be a path (by Proposition 5.1.2, since T has no cycles).
This path has length d(v, u) + 1 (since it was obtained by appending an edge
to the path p, which has length d(v, u)). Hence, d(v, w) = d(v, u) + 1. But the
definition of eccentricity yields


ecc_T v ≥ d(v, w) = d(v, u) + 1 = ecc_{T\L} v + 1        (13)


(since d(v, u) = ecc_{T\L} v).


On the other hand, let x be a vertex of T such that d(v, x) is maximum. Thus,
ecc_T v = d(v, x) (by the definition of ecc_T v). The path from v to x has length
≥ 1 (since otherwise, we would have x = v and therefore d(v, x) = d(v, v) = 0,
which would easily contradict the maximality of d(v, x)). Thus, it has a
second-to-last vertex. Let y be this second-to-last vertex. Then, the path from v to
y is simply the path from v to x with its last edge removed. Consequently,
d(v, y) = d(v, x) - 1. However, it is easy to see that y ∈ V \ L.^38 In other
words, y is a vertex of T \ L. Thus, the definition of eccentricity yields


ecc_{T\L} v ≥ d(v, y) = d(v, x) - 1 = ecc_T v - 1


(since d(v, x) = ecc_T v), so that ecc_T v ≤ ecc_{T\L} v + 1. Combining this with (13),
we obtain ecc_T v = ecc_{T\L} v + 1. This proves Lemma 5.5.10 (d).


^37 If the path p has no edges, then f can be any edge that contains u.
^38 Proof. Assume the contrary. Thus, y ∉ V \ L. Hence, y ≠ v (since y ∉ V \ L but v ∈ V \ L).
However, y is the second-to-last vertex of the path from v to x. Therefore, y is either
the starting point v of this path, or an intermediate vertex of this path. Since y ≠ v, we
thus conclude that y is an intermediate vertex of this path. Hence, by (12), we see that y
must belong to V \ L. But this contradicts y ∉ V \ L. This contradiction shows that our
assumption was false, qed.
(e) If u and v are two vertices of T \ L, then the two distances d_T(u, v) and
d_{T\L}(u, v) are equal (by Lemma 5.5.10 (c)); thus, we shall denote both distances
by d(u, v) (since there is no confusion to be afraid of).

Let v ∈ L be a leaf. Let w be the unique neighbor of v in T. We must prove
that ecc_T v = ecc_T w + 1.

We first claim that 


d(v, u) = d(w, u) + 1 for each u ∈ V \ {v}.        (14)


[Proof of (14): We have deg v = 1 (since v is a leaf). In other words, there is a
unique edge of T that contains v. Let e be this edge. The endpoints of e are v
and w (since w is the unique neighbor of v). Thus, v ≠ w (since T has no loops)
and d(v, w) = 1.

Now, let u ∈ V \ {v}. Then, the path from v to u in T must have length ≥ 1
(since u ≠ v), and therefore must begin with the edge e (since e is the only edge
that contains v). If we remove this edge e from this path, we thus obtain a path
from w to u. As a consequence, the path from v to u is longer by exactly 1 edge
than the path from w to u. In other words, we have d(v, u) = d(w, u) + 1. This
proves (14).]


Now, the definition of eccentricity yields
ecc_T v = max {d(v, u) | u ∈ V}.        (15)


This maximum is clearly not attained for u = v (since d(v, v) = 0 is smaller
than d(v, w) = 1). Thus, this maximum does not change if we remove v from
its indexing set V. Hence, (15) rewrites as


ecc_T v = max {d(v, u) | u ∈ V \ {v}}
        = max {d(w, u) + 1 | u ∈ V \ {v}}        (by (14))
        = max {d(w, u) | u ∈ V \ {v}} + 1.        (16)


On the other hand, the definition of eccentricity yields
ecc_T w = max {d(w, u) | u ∈ V}.        (17)


We shall now show that this maximum does not change if we remove v from
its indexing set V. In other words, we shall show that


max {d(w, u) | u ∈ V} = max {d(w, u) | u ∈ V \ {v}}.        (18)


[Proof of (18): Assume that (18) is false. Then, the maximum max {d(w, u) | u ∈ V}
is attained only at u = v. In other words, we have


d(w, v) > d(w, u) for all u ∈ V \ {v}.        (19)
However, the tree T has more than 2 vertices. Thus, it has a vertex u that is
distinct from both v and w. Consider this u. Thus, u ∈ V \ {v}, so that (19)
yields d(w, v) > d(w, u). In view of d(w, v) = d(v, w) = 1, this rewrites as
1 > d(w, u), so that d(w, u) < 1. Therefore, w = u. But this contradicts the
fact that w is distinct from u. This contradiction shows that our assumption
was false, and thus (18) is proved.]


Now, (16) becomes


ecc_T v = max {d(w, u) | u ∈ V \ {v}} + 1
        = max {d(w, u) | u ∈ V} + 1        (by (18))
        = ecc_T w + 1        (by (17)).

This proves Lemma 5.5.10 (e).


(f) Lemma 5.5.10 (e) shows that any vertex v ∈ L has a higher eccentricity
than its unique neighbor. Thus, a vertex v of T that minimizes ecc_T v cannot
belong to L. In other words, a vertex v of T that minimizes ecc_T v must belong
to V \ L.

However, the centers of T are defined to be the vertices of T that minimize
ecc_T v. As we just proved, these vertices must belong to V \ L. Thus, the centers
of T can also be characterized as the vertices v ∈ V \ L that minimize ecc_T v.
However, a vertex v ∈ V \ L minimizes ecc_T v if and only if it minimizes ecc_{T\L} v
(because Lemma 5.5.10 (d) yields ecc_T v = ecc_{T\L} v + 1 for any such vertex v).
Thus, we conclude that the centers of T can be characterized as the vertices
v ∈ V \ L that minimize ecc_{T\L} v. But this is precisely the definition of the
centers of T \ L. As a consequence, we see that the centers of T are precisely
the centers of T \ L. This proves Lemma 5.5.10 (f). □


Proof of Theorem 5.5.9. We shall prove parts (a) and (b) of Theorem 5.5.9 by
strong induction on |V(T)|:

Induction step: Consider a tree T. Assume that parts (a) and (b) of Theorem
5.5.9 are true for any tree with fewer than |V(T)| many vertices. We must now
prove these parts for our tree T.

If |V(T)| ≤ 2, then both parts are obvious. Hence, WLOG assume that
|V(T)| > 2. Thus, the tree T has more than 2 vertices. Let L be the set of all
leaves of T. Note that |L| ≥ 2 (since we know that any tree with at least 2
vertices has at least 2 leaves). Define the multigraph T \ L as in Lemma 5.5.10.
Then, Lemma 5.5.10 (f) shows that the centers of T are precisely the centers of
T \ L.

However, Lemma 5.5.10 (a) yields that T \ L is again a tree. This tree has
fewer vertices than T (since |L| ≥ 2 > 0). Hence, by the induction hypothesis,
both parts (a) and (b) of Theorem 5.5.9 are true for the tree T \ L instead of T.
In other words, the tree T \ L has either 1 or 2 centers, and if it has 2 centers,
then these 2 centers are adjacent. Since the centers of T are precisely the centers
of T \ L, we can rewrite this as follows: The tree T has either 1 or 2 centers,
and if it has 2 centers, then these 2 centers are adjacent. In other words, parts
(a) and (b) of Theorem 5.5.9 hold for our tree T. This completes the induction
step. Thus, parts (a) and (b) of Theorem 5.5.9 are proved.


(c) This follows from Lemma 5.5.10 (f). Indeed, if T has at most 2 vertices,
then all vertices of T are centers of T (this is trivial to check). If not, then each
“leaf-removal” step of our algorithm leaves the set of centers of T unchanged
(by Lemma 5.5.10 (f)), and thus the centers of the original tree T are precisely
the centers of the tree that remains at the end of the algorithm. But the latter
tree has at most 2 vertices, and thus its centers are precisely its vertices. So the
centers of T are precisely the vertices that remain at the end of the algorithm.


Theorem 5.5.9 (c) is proven. □


The following exercise shows another approach to the centers of a tree: 


Exercise 5.14. Let T be a tree. Let p = (p_0, *, p_1, *, p_2, ..., *, p_m) be a longest
path of T. (We write asterisks for the edges since we don’t need to name 
them.) 

Prove the following: 


(a) If m is even, then the only center of T is p_{m/2}.


(b) If m is odd, then the two centers of T are p_{(m-1)/2} and p_{(m+1)/2}.


Remark 5.5.12. Exercise 5.14 is a result by Arthur Cayley from 1875. It shows
once again that each tree has exactly one center or two adjacent centers, and 
also shows that any two longest paths of a tree have a common vertex. 


The notion of a centroid of a tree is a relative of the notion of a center. We 
briefly discuss it in the following exercise: 


Exercise 5.15. Let T be a tree. For any vertex v of T, we let c_v denote the size
of the largest component of the graph T \ v. (Recall that T \ v is the graph 
obtained from T by removing the vertex v and all edges that contain v. Note 
that a component (according to our definition) is a set of vertices; thus, its 
size is the number of vertices in it.) 

The vertices v of T that minimize the number c_v are called the centroids of
T. 


(a) Prove that T has no more than two centroids, and furthermore, if T has
two centroids, then these two centroids are adjacent.


(b) Find a tree T such that the centroid(s) of T are distinct from the center(s)
of T.


[Example: Here is an example of a tree T, where each vertex v is labelled 
with the corresponding number c_v:


Thus, the vertex labelled 5 is the only centroid of this tree T.] 


Note the analogy between Exercise 5.15 (a) and Theorem 5.5.9 (a) and (b).
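To experiment with centroids, one can compute the numbers c_v directly from their definition. The following is a minimal sketch, assuming (as our own convention) that the tree is stored as a dictionary of neighbor sets; the names `component_sizes_without` and `centroids` are illustrative and do not come from these notes.

```python
def component_sizes_without(adj, v):
    """Sizes of the components of the graph obtained by removing the vertex v."""
    seen = {v}
    sizes = []
    # In a connected graph, every component left after removing v
    # contains a neighbor of v, so it suffices to start from those.
    for start in adj[v]:
        if start in seen:
            continue
        seen.add(start)
        stack = [start]
        size = 0
        while stack:
            u = stack.pop()
            size += 1
            for w in adj[u]:
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
        sizes.append(size)
    return sizes


def centroids(adj):
    """Return (set of centroids, dictionary v -> c_v)."""
    c = {v: max(component_sizes_without(adj, v), default=0) for v in adj}
    smallest = min(c.values())
    return {v for v in adj if c[v] == smallest}, c


# Hypothetical example: the path 1 - 2 - 3 - 4 - 5 with an extra leaf 6 at 4.
tree = {1: {2}, 2: {1, 3}, 3: {2, 4}, 4: {3, 5, 6}, 5: {4}, 6: {4}}
found, c = centroids(tree)
print(sorted(found), c)
```

Printing the dictionary of all c_v (rather than only the minimizers) makes it easy to compare the result with a hand-labelled picture of the tree.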


5.6. Arborescences 
5.6.1. Definitions 


Enough about undirected graphs. 
What would be a directed analogue of a tree? I.e., what kind of digraphs 
play the same role among digraphs that trees do among undirected graphs? 
Trees are graphs that are connected and have no cycles. This suggests two 
directed versions: 


• We can study digraphs that are strongly connected and have no cycles.
Unfortunately, there is not much to study: Any such digraph has only 1
vertex and no arcs. (Make sure you understand why!)


• We can drop the connectedness requirement. Digraphs that have no cycles
are called acyclic, and more typically they are called dags (short for
“directed acyclic graphs”).


However, these dags aren’t quite like trees. For example, a tree always has 


An introduction to graph theory, version August 2, 2023 page 185 


fewer edges than vertices, but a dag can have more arcs than vertices.^39
Here is a more convincing analogue of trees for digraphs:^40


Definition 5.6.1. Let D be a multidigraph. Let r be a vertex of D. 


(a) We say that r is a from-root (or, short, root) of D if for each vertex v of 
D, the digraph D has a path from r to v. 


(b) We say that D is an arborescence rooted from r if r is a from-root of D
and the undirected multigraph D^und has no cycles. (Recall that D^und is
the multigraph obtained from D by turning each arc into an undirected
edge. Parallel arcs are not merged into one!)


Of course, there are analogous notions of a “to-root” and an “arborescence 
rooted towards r”, but these are just the same notions that we just defined with 
all arrows reversed. So we need not study them separately; we can just take 
any property of “rooted from” and reverse all arcs to make it into a property of 
“rooted to”. 
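Definition 5.6.1 translates almost literally into code. The sketch below makes some assumptions of ours: a multidigraph is stored as a set of vertices together with a dictionary that maps each arc name to its (source, target) pair, and the function names are illustrative. Cycles of the underlying undirected multigraph are detected with a union-find structure, which also catches loops and parallel edges.

```python
def is_from_root(vertices, arcs, r):
    """Check whether every vertex can be reached from r along arcs."""
    seen = {r}
    stack = [r]
    while stack:
        u = stack.pop()
        for source, target in arcs.values():
            if source == u and target not in seen:
                seen.add(target)
                stack.append(target)
    return seen == set(vertices)


def und_has_cycle(vertices, arcs):
    """Check whether the underlying undirected multigraph has a cycle."""
    parent = {v: v for v in vertices}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    for source, target in arcs.values():
        a, b = find(source), find(target)
        if a == b:
            # Both endpoints already lie in one component: a cycle closes.
            return True
        parent[a] = b
    return False


def is_arborescence_rooted_from(vertices, arcs, r):
    return is_from_root(vertices, arcs, r) and not und_has_cycle(vertices, arcs)


# Hypothetical example: a directed triangle 0 -> 1 -> 2 -> 0 plus an arc 2 -> 3.
V = {0, 1, 2, 3}
triangle = {"a": (0, 1), "b": (1, 2), "c": (2, 0), "d": (2, 3)}
print([r for r in sorted(V) if is_from_root(V, triangle, r)])
print(is_arborescence_rooted_from(V, triangle, 0))

# Dropping the arc c destroys the cycle.
tree_like = {"a": (0, 1), "b": (1, 2), "d": (2, 3)}
print(is_arborescence_rooted_from(V, tree_like, 0))
```

The analogous "to-root" checks are obtained by swapping source and target in every arc before calling these functions.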


Example 5.6.2. The multidigraph 


has three from-roots (namely, 0, 1 and 2). It is not an arborescence rooted 
from any of them, because turning each arc into an undirected edge yields a 
graph with a cycle. 


^39 For example, here is a dag with 4 vertices and 5 arcs:

[Figure: a dag with 4 vertices and 5 arcs.]

^40 We recall that we defined a multigraph D^und for every multidigraph D (in Definition
4.4.1). Roughly speaking, this multigraph D^und is obtained by “forgetting the
directions” of the arcs of D. Parallel arcs are not merged into one.

If we reverse the arc from 0 to 1, then we obtain a multidigraph


which has only one from-root (namely, 1) and is still not an arborescence (for 
the same reason as before). 


Example 5.6.3. Consider the following multidigraph: 


This is an arborescence rooted from 6. Indeed, it has paths from 6 to all 
vertices, and turning each arc into an undirected edge yields a tree. 
If we reverse the arc from 1 to 2, we obtain a multidigraph 




which is not an arborescence, because it has no from-root anymore.

5.6.2. Arborescences vs. trees: statement


The above examples suggest that an arborescence rooted from r is basically
the same as a tree all of whose edges have been “oriented away from r”. More
precisely:


Theorem 5.6.4. Let D be a multidigraph, and let r be a vertex of D. Then,
the following two statements are equivalent:


• Statement C1: The multidigraph D is an arborescence rooted from r.

• Statement C2: The undirected multigraph D^und is a tree, and each arc
of D is “oriented away from r” (this means the following: the source of
this arc lies on the unique path between r and the target of this arc on
D^und).


This is an easy theorem to believe, but an annoyingly hard one to formally 
prove in full detail! We shall prove this theorem later. 
5.6.3. The arborescence equivalence theorem 


First, let us show another bunch of equivalent criteria for arborescences, imitating
the tree equivalence theorem (Theorem 5.2.4):


Theorem 5.6.5 (The arborescence equivalence theorem). Let D = (V, A, ψ)
be a multidigraph with a from-root r. Then, the following six statements are
equivalent:


• Statement A1: The multidigraph D is an arborescence rooted from r.

• Statement A2: We have |A| = |V| - 1.

• Statement A3: The multigraph D^und is a tree.

• Statement A4: For each vertex v ∈ V, the multidigraph D has a unique
walk from r to v.

• Statement A5: If we remove any arc from D, then the vertex r will no
longer be a from-root of the resulting multidigraph.

• Statement A6: We have deg^- r = 0, and each v ∈ V \ {r} satisfies
deg^- v = 1.


Proof. We will prove the implications A1 ⟹ A4 ⟹ A5 ⟹ A6 ⟹ A2 ⟹ A3 ⟹ A1.
Since these implications form a cycle that includes all six statements, this will
entail that all six statements are equivalent.

Before we prove these implications, we introduce a notation: If a is any arc
of D, then D \ a shall denote the multidigraph obtained from D by removing
this arc a. (Formally, this means that D \ a := (V, A \ {a}, ψ|_{A \ {a}}).)

We now come to the proofs of the promised implications.


Proof of the implication A1 ⟹ A4: Assume that Statement A1 holds. Thus, D
is an arborescence rooted from r. In other words, r is a from-root of D and the
undirected multigraph D^und has no cycles.

We must show that for each vertex v ∈ V, the multidigraph D has a unique
walk from r to v. The existence of such a walk is clear (because r is a from-root
of D). It is the uniqueness that we need to prove.

Assume the contrary. Thus, there exists a vertex v ∈ V such that two distinct
walks p and q from r to v exist. However, the multidigraph D has no loops (since
any loop of D would be a loop of D^und, and thus create a cycle of D^und, but
we know that D^und has no cycles). Hence, any walk of D is automatically a
backtrack-free walk of D^und (indeed, it is backtrack-free because the only way
two consecutive arcs of a walk in a digraph can be equal is if they are loops).
Therefore, the two walks p and q of D are two backtrack-free walks of D^und.
Thus, there are two distinct backtrack-free walks from r to v in D^und (namely,
p and q). Theorem 5.1.3 thus lets us conclude that D^und has a cycle. But this
contradicts the fact that D^und has no cycles.

This contradiction shows that our assumption was wrong. Hence, we have
proved that for each vertex v ∈ V, the multidigraph D has a unique walk from
r to v. In other words, Statement A4 holds.


Proof of the implication A4 ⟹ A5: Assume that Statement A4 holds.

Let now a be any arc of D. We shall show that r is not a from-root of the 
multidigraph D \ a. 

Indeed, let s be the source and t the target of the arc a. We shall show that 
the digraph D \ a has no path from r to t. 

Indeed, assume the contrary. Thus, D \ a has some path p from r to t. This 
path does not use the arc a (since it is a path of D \ a). 

On the other hand, we have assumed that Statement A4 holds. Applying this
statement to v = s, we conclude that the multidigraph D has a unique walk
from r to s. Let (v_0, a_1, v_1, a_2, v_2, ..., a_k, v_k) be this walk. By appending the arc a
and the vertex t to its end, we extend it to a longer walk


(v_0, a_1, v_1, a_2, v_2, ..., a_k, v_k, a, t),


which is a walk from r to t. We denote this walk by q.

We have now found two walks from r to t in the digraph D: namely, the
path p and the walk q. These two walks are distinct (since q uses the arc a,
but p does not). However, Statement A4 (applied to v = t) yields that the
multidigraph D has a unique walk from r to t. This contradicts the fact that we
have just found two distinct such walks.

This contradiction shows that our assumption was false. Hence, the digraph
D \ a has no path from r to t. Thus, r is not a from-root of D \ a.

Forget that we fixed a. We have now proved that if a is any arc of D, then r 
is not a from-root of D \ a. In other words, if we remove any arc from D, then 
the vertex r will no longer be a from-root of the resulting multidigraph. Thus, 
Statement A5 holds. 


Proof of the implication A5 ⟹ A6: Assume that Statement A5 holds. We must
prove that Statement A6 holds. In other words, we must prove that deg^- r = 0,
and that each v ∈ V \ {r} satisfies deg^- v = 1.

Let us first prove that deg^- r = 0. Indeed, assume the contrary. Thus,
deg^- r ≠ 0, so that there exists an arc a with target r. We shall show that r
is a from-root of D \ a.

The arc a has target r. Thus, a path that starts at r cannot use this arc a 
(because this arc would lead it back to r, but a path is not allowed to revisit 
any vertex), and therefore must be a path of D \ a. Thus we have shown that 
any path of D that starts at r is also a path of D \ a. However, for each vertex 
v of D, the digraph D has a path from r to v (since r is a from-root of D). This 
path is also a path of D \ a (since any path of D that starts at r is also a path 
of D \ a). Thus, for each vertex v of D \ a, the digraph D \ a has a path from 
r to v. In other words, r is a from-root of D \ a. However, we have assumed 
that Statement A5 holds. Thus, in particular, if we remove the arc a from D, 
then the vertex r will no longer be a from-root of the resulting multidigraph. In 
other words, r is not a from-root of D \ a. But this contradicts the fact that r is 
a from-root of D \ a. 

This contradiction shows that our assumption was false. Hence, deg^- r = 0
is proved.

Now, let v ∈ V \ {r} be arbitrary. We must show that deg^- v = 1.

Indeed, assume the contrary. Thus, deg^- v ≠ 1. Using the fact that r is a
from-root of D, it is thus easy to see that deg^- v ≥ 2.^41 Hence, there exist
two distinct arcs a and b with target v. Consider these arcs a and b.

We are in one of the following three cases: 

Case 1: The digraph D \ a has a path from r to v. 

Case 2: The digraph D \ b has a path from r to v. 

Case 3: Neither the digraph D \ a nor the digraph D \ b has a path from r to 
v. 

Let us first consider Case 1. In this case, the digraph D \ a has a path from r 
to v. Let p be such a path. 

We have assumed that Statement A5 holds. Thus, in particular, if we remove
the arc a from D, then the vertex r will no longer be a from-root of the resulting
multidigraph. In other words, r is not a from-root of D \ a. In other words,
there exists a vertex w ∈ V such that the digraph D \ a has no path from r to w
(by the definition of a “from-root”). Consider this vertex w.

^41 Proof. Since r is a from-root of D, we know that the digraph D has a path from r to v. Since
v ≠ r (because v ∈ V \ {r}), this path must have at least one arc. The last arc of this path
is clearly an arc with target v. Thus, there exists at least one arc with target v. In other
words, deg^- v ≥ 1. Combining this with deg^- v ≠ 1, we obtain deg^- v > 1. In other words,
deg^- v ≥ 2.

The digraph D has a path q from r to w (since r is a from-root of D). Consider 
this path q. If the path q did not use the arc a, then it would be a path of D \ a 
as well, but this would contradict the fact that D \ a has no path from r to w. 
Thus, the path q must use the arc a. 

Consider the part of q that comes after the arc a. This part must be a path 
from v to w (since the arc a has target v, whereas the path q has ending point 
w). Let us denote this path by q’. Thus, the path q’ does not use the arc a (since 
it was defined as the part of q that comes after a). Hence, q’ is a path of D \ a. 

Now, we know that the digraph D \ a has a path p from r to v as well as a 
path q’ from v to w. Splicing these paths together, we obtain a walk p * q’ from 
r to w. So we know that D \ a has a walk from r to w. According to Corollary
3.3.10, we thus conclude that D \ a has a path from r to w. This contradicts the
fact that D \ a has no path from r to w. 

We have thus obtained a contradiction in Case 1. 

The same argument (but with the roles of a and b interchanged) results in a 
contradiction in Case 2. 

Let us finally consider Case 3. In this case, neither the digraph D \ a nor the 
digraph D \ b has a path from r to v. However, the digraph D has a path p 
from r to v (since r is a from-root of D). Consider this path p. If this path p did 
not use the arc a, then it would be a path of D \ a, but this would contradict 
our assumption that the digraph D \ a has no path from r to v. Thus, this path 
p must use the arc a. For a similar reason, it must also use the arc b. However, 
the two arcs a and b have the same target (viz., v) and thus cannot both appear 
in the same path (since a path cannot visit a vertex more than once). This 
contradicts the fact that the path p uses both arcs a and b. Hence, we have 
found a contradiction in Case 3. 

We have now found contradictions in all three Cases 1, 2 and 3. Hence, our
assumption was false, and deg^- v = 1 is proved.

We have now proved that each v ∈ V \ {r} satisfies deg^- v = 1. Since we
have also shown that deg^- r = 0, we thus have proved Statement A6.


Proof of the implication A6 ⟹ A2: Assume that Statement A6 holds. We must
prove that Statement A2 holds. However, Proposition yields 


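\[
|A| = \sum_{v \in V} \deg^- v
    = \underbrace{\deg^- r}_{\substack{= 0 \\ \text{(by Statement A6)}}}
      + \sum_{v \in V \setminus \{r\}} \underbrace{\deg^- v}_{\substack{= 1 \\ \text{(by Statement A6)}}}
    = 0 + \sum_{v \in V \setminus \{r\}} 1
    = \sum_{v \in V \setminus \{r\}} 1
    = |V \setminus \{r\}| = |V| - 1.
\]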


Hence, Statement A2 holds. 


Proof of the implication A2 ⟹ A3: Assume that Statement A2 holds. We must
prove that Statement A3 holds.

For each v ∈ V, the digraph D has a path from r to v (since r is a from-root
of D). Thus, for each v ∈ V, the graph D^und has a path from r to v (since
any path of D is a path of D^und). Therefore, any two vertices u and v of D^und
are path-connected in D^und (because we can get from u to v via r, according
to the previous sentence). Therefore, the graph D^und is connected (since it
has at least one vertex^42). Moreover, its number of edges is |A| = |V| - 1 (by
Statement A2). Therefore, the multigraph D^und satisfies the Statement T4 of the
tree equivalence theorem (Theorem 5.2.4). Consequently, it satisfies Statement
T1 of that theorem as well. In other words, it is a tree. This proves Statement
A3.


Proof of the implication A3 ⟹ A1: Assume that Statement A3 holds. We must
prove that Statement A1 holds.

The multigraph D^und is a tree (by Statement A3), and thus is a forest; hence,
it has no cycles. Since we also know that r is a from-root of D, we thus conclude 
that D is an arborescence rooted from r (by the definition of an arborescence). 
In other words, Statement A1 is satisfied. 


We have now proved all six implications in the chain
A1 ⟹ A4 ⟹ A5 ⟹ A6 ⟹ A2 ⟹ A3 ⟹ A1. Thus, all six statements A1, A2,
..., A6 are equivalent. This proves Theorem 5.6.5. □
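The equivalence can be sanity-checked on small examples by evaluating several of the statements independently. The sketch below is an illustration under assumptions of ours, not part of the proof: a multidigraph is stored as a vertex set plus a dictionary from arc names to (source, target) pairs, all function names are illustrative, and Statement A4 is left out to keep the code short.

```python
def is_from_root(vertices, arcs, r):
    """Check whether every vertex can be reached from r along arcs."""
    seen = {r}
    stack = [r]
    while stack:
        u = stack.pop()
        for source, target in arcs.values():
            if source == u and target not in seen:
                seen.add(target)
                stack.append(target)
    return seen == set(vertices)


def und_structure(vertices, arcs):
    """Return (number of components, has_cycle) of the underlying multigraph."""
    parent = {v: v for v in vertices}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    components = len(parent)
    has_cycle = False
    for source, target in arcs.values():
        a, b = find(source), find(target)
        if a == b:
            has_cycle = True
        else:
            parent[a] = b
            components -= 1
    return components, has_cycle


def check_statements(vertices, arcs, r):
    """Evaluate Statements A1, A2, A3, A5, A6 for a digraph with from-root r."""
    if not is_from_root(vertices, arcs, r):
        raise ValueError("r must be a from-root (hypothesis of the theorem)")
    components, has_cycle = und_structure(vertices, arcs)
    indegree = {v: 0 for v in vertices}
    for _, target in arcs.values():
        indegree[target] += 1

    def without(a):
        return {name: st for name, st in arcs.items() if name != a}

    return {
        "A1": not has_cycle,  # r is a from-root by assumption
        "A2": len(arcs) == len(vertices) - 1,
        "A3": components == 1 and not has_cycle,
        "A5": all(not is_from_root(vertices, without(a), r) for a in arcs),
        "A6": indegree[r] == 0
        and all(indegree[v] == 1 for v in vertices if v != r),
    }


# Hypothetical examples: an arborescence rooted from 0, and the same
# digraph with one extra arc d.
V = {0, 1, 2, 3}
good = {"a": (0, 1), "b": (0, 2), "c": (2, 3)}
bad = {"a": (0, 1), "b": (0, 2), "c": (2, 3), "d": (1, 3)}
print(check_statements(V, good, 0))
print(check_statements(V, bad, 0))
```

On each input, the five computed truth values agree with each other, as the theorem predicts.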


Exercise 5.16. Let D = (V, A, ψ) be a multidigraph that has no cycles.^43 Let
r ∈ V be some vertex of D. Prove the following:


(a) If deg^- u > 0 holds for all u ∈ V \ {r}, then r is a from-root of D.


(b) If deg^- u = 1 holds for all u ∈ V \ {r}, then D is an arborescence rooted
from r.


5.7. Arborescences vs. trees 


Our next goal is to prove Theorem 5.6.4, which connects arborescences with
trees.

To prove it formally, we introduce a few notations regarding trees. First, we
recall the notion of a distance (Definition 5.5.1). We claim the following simple
property of distances in trees:


Proposition 5.7.1. Let T = (V, E, φ) be a tree. Let r ∈ V be a vertex of
T. Let e be an edge of T, and let u and v be its two endpoints. Then,
the distances d(r, u) and d(r, v) differ by exactly 1 (that is, we have either
d(r, u) = d(r, v) + 1 or d(r, v) = d(r, u) + 1).


^42 This is because r ∈ V.
^43 Recall that cycles in a digraph have to be directed cycles, i.e., each arc is traversed from its
source to its target.

Proof. We recall that since T is a tree, the distance d(p, q) between two vertices
p and q of T is simply the length of the path from p to q. (This path is unique,
since T is a tree.)

Let p be the path from r to u. Then, we are in one of the following two cases: 

Case 1: The edge e is an edge of p. 

Case 2: The edge e is not an edge of p. 

Consider Case 1. In this case, e must be the last edge of p (since otherwise, p
would visit u more than once, but p cannot do this, since p is a path). Thus, if
we remove this last edge e (and the vertex u) from p, then we obtain a path from
r to v. This path is exactly one edge shorter than p. Thus, d(r, v) = d(r, u) - 1,
so that d(r, u) = d(r, v) + 1. So we are done in Case 1.

Now, consider Case 2. In this case, the edge e is not an edge of p. Thus, we
can append e and v to the end of the path p, and the result will be a backtrack-free
walk p’. However, a backtrack-free walk in a tree is always a path (since
otherwise, it would contain a cycle,^44 but a tree has no cycles). Thus, p’ is a
path from r to v, and it is exactly one edge longer than p (by its construction).
Therefore, d(r, v) = d(r, u) + 1. So we are done in Case 2.

Now, we are done in both cases, so that Proposition 5.7.1 is proven. □


Definition 5.7.2. Let T = (V, E, φ) be a tree. Let r ∈ V be a vertex of T. Let e
be an edge of T. By Proposition 5.7.1, the distances from the two endpoints
of e to the vertex r differ by exactly 1. So one of them is smaller than the
other.


(a) We define the r-parent of e to be the endpoint of e whose distance to r
is the smaller. We denote this endpoint by e^{-r}.


(b) We define the r-child of e to be the endpoint of e whose distance to r is
the larger. We denote this endpoint by e^{+r}.


Thus, by Proposition 5.7.1, we have


d(r, e^{+r}) = d(r, e^{-r}) + 1.


Example 5.7.3. Here is a tree T, a vertex r, an edge e and its r-parent e^{-r} and
its r-child e^{+r}:


^44 by Proposition 5.1.2


Definition 5.7.4. Let T = (V, E, φ) be a tree. Let r ∈ V be a vertex of T. Then,
we define a multidigraph T^{r→} by


T^{r→} := (V, E, ψ),


where ψ : E → V × V is the map that sends each edge e ∈ E to the pair
(e^{-r}, e^{+r}). Colloquially speaking, this means that T^{r→} is the multidigraph
obtained from T by turning each edge e into an arc from its r-parent e^{-r} to
its r-child e^{+r}. This is what we mean when we speak of “orienting each edge
of T away from r” in Theorem 5.6.4.
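The construction of T^{r→} is easy to carry out mechanically: compute all distances d(r, v) by a breadth-first search, then point each edge from its endpoint closer to r to its endpoint farther from r. The following minimal sketch assumes a representation of our own choosing (a vertex set plus a dictionary from edge names to endpoint pairs); the name `orient_away_from` is illustrative.

```python
from collections import deque


def orient_away_from(vertices, edges, r):
    """Turn each edge of a tree into an arc (r-parent, r-child).

    edges maps each edge name to a pair of endpoints, in either order.
    Assumption: the input is a tree, so every vertex is reachable from r.
    """
    adj = {v: [] for v in vertices}
    for u, v in edges.values():
        adj[u].append(v)
        adj[v].append(u)

    # Breadth-first search computes the distances d(r, v).
    dist = {r: 0}
    queue = deque([r])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)

    arcs = {}
    for name, (u, v) in edges.items():
        # The two distances differ by exactly 1 (Proposition 5.7.1),
        # so exactly one endpoint is the r-parent.
        arcs[name] = (u, v) if dist[u] < dist[v] else (v, u)
    return arcs


# Hypothetical example: a star-like tree with center 2, oriented away from 3.
V = {1, 2, 3, 4}
E = {"e1": (1, 2), "e2": (2, 3), "e3": (2, 4)}
print(orient_away_from(V, E, 3))
```

The edge names are kept as arc names, mirroring the fact that T^{r→} has the same set E of arcs as T has edges.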


Example 5.7.5. If T is the tree from Example 5.7.3, then T^{r→} is the following
multidigraph:


Now, Theorem 5.6.4 can be rewritten as follows:


Theorem 5.7.6. Let D be a multidigraph, and let r be a vertex of D. Then, 
the following two statements are equivalent: 


e Statement C1: The multidigraph D is an arborescence rooted from r. 


• Statement C2: The undirected multigraph D^und is a tree, and we have
D = (D^und)^{r→}. (This is an honest equality, not just some isomorphism.)


The proof of this theorem is best organized by splitting into two lemmas:


Lemma 5.7.7. Let T = (V, E, φ) be a tree. Let r ∈ V be a vertex of T. Then,
the multidigraph T^{r→} is an arborescence rooted from r.


Proof. The idea is to show that if p is a path from r to some vertex v in the tree
T, then p is also a path in the digraph T^{r→}, because all the edges of p have been
“oriented correctly” (i.e., their orientation matches how they are used in p).

Here are the details: Clearly, (T^{r→})^und = T. Hence, the graph (T^{r→})^und is a
tree and hence has no cycles. Thus, it suffices to prove that r is a from-root of
T^{r→}. In other words, we must prove that


T^{r→} has a path from r to v    (20)


for each v ∈ V.

We shall prove (20) by induction on d(r, v) (where d means the distance on
the tree T):

Base case: If v ∈ V satisfies d(r, v) = 0, then v = r, and thus T^{r→} has a path
from r to v (namely, the trivial path (r)). Thus, (20) is proved for d(r, v) = 0.

Induction step: Let k ∈ ℕ. Assume (as the induction hypothesis) that (20)
holds for each v ∈ V satisfying d(r, v) = k. We must now prove the same for
each v ∈ V satisfying d(r, v) = k + 1.

So let v ∈ V satisfy d(r, v) = k + 1. Then, the path of T from r to v has
length k + 1. Let p be this path, let e be its last edge, and let u be its second-to-last
vertex (so that its last edge e has endpoints u and v). Then, by removing
the last edge e from the path p, we obtain a path from r to u that is one edge
shorter than p. Hence, d(r, u) = d(r, v) - 1 < d(r, v). Consequently, the edge
e has r-parent u and r-child v (by Definition 5.7.2). In other words, e^{-r} = u
and e^{+r} = v. Therefore, in the digraph T^{r→}, the edge e is an arc from u to
v (by Definition 5.7.4). Moreover, we have d(r, u) = d(r, v) - 1 = k (since
d(r, v) = k + 1); therefore, the induction hypothesis tells us that (20) holds for
u instead of v. In other words, T^{r→} has a path from r to u. Attaching the arc
e and the vertex v to this path, we obtain a walk of T^{r→} from r to v (since e
is an arc from u to v in T^{r→}). Thus, the digraph T^{r→} has a walk from r to v,
therefore also a path from r to v. Hence, (20) holds for our v. This completes
the induction step.

Thus, (20) is proved by induction. As we explained above, this yields Lemma
5.7.7. □


Lemma 5.7.8. Let D = (V, A, ψ) be an arborescence rooted from r (for some
r ∈ V). Let a ∈ A be an arc of D. Let s be the source of a, and let t be the
target of a. Then:


(a) We have d(r, s) < d(r, t), where d means distance on the tree D^und.


(b) In the multidigraph (D^und)^{r→}, the arc a has source s and target t.


Proof. (a) The vertex r is a from-root of D (since D is an arborescence rooted
from r). Thus, D has a path from r to t. Let p be this path. Note that deg^- t ≥ 1,
since t is the target of at least one arc (namely, of a).

The digraph D is an arborescence rooted from r, and thus satisfies Statement 
A6 in the arborescence equivalence theorem (Theorem 5.6.5). In other words, 
we have 


deg^- r = 0 and deg^- v = 1 for each v ∈ V \ {r}.


In particular, this entails deg^- v ≤ 1 for each v ∈ V. Applying this to v = t, we
obtain deg^- t ≤ 1. Hence, the arc a is the only arc whose target is t.

We have t ≠ r (since deg^- r = 0 but deg^- t ≥ 1 > 0). Thus, the path p from r
to t has at least one arc. Its last arc is therefore an arc whose target is t. Hence,
this last arc is a (since a is the only arc whose target is t).

If we remove this last arc from the path p, then we obtain a path p’ from r to 
s (since s is the source of a). 

However, each path of D is a path of D^und. Thus, in particular, p is a path of
D^und from r to t, while p’ is a path of D^und from r to s. Since p’ is exactly one
edge shorter than p, we thus obtain d(r, s) = d(r, t) - 1 < d(r, t). This proves
Lemma 5.7.8 (a).


(b) The arc a of the digraph D has source s and target t. Hence, the edge a
of the tree D^und has endpoints s and t. Since d(r, s) < d(r, t) (by part (a)), this
entails that its r-parent is s and its r-child is t (by Definition 5.7.2). Thus, in the
digraph (D^und)^{r→}, this edge a becomes an arc with source s and target t (by
Definition 5.7.4). This proves Lemma 5.7.8 (b). □


Proof of Theorem 5.7.6. If (V, A, ψ) is a multidigraph, then we shall refer to the
map ψ : A → V × V (which determines the source and the target of each arc)
as the “psi-map” of this multidigraph.

Write the multidigraph D as D = (V, A, ψ). We shall now prove the implications
C1 ⟹ C2 and C2 ⟹ C1 separately:


Proof of the implication C1 ⟹ C2: Assume that Statement C1 holds. That is,
D is an arborescence rooted from r. We must prove Statement C2. In other
words, we must prove that the undirected multigraph D^und is a tree, and that
D = (D^und)^{r→}.

It is clear (by the definition of an arborescence) that D^und is a tree. It thus
remains to prove that D = (D^und)^{r→}.

The multidigraphs D and (D^und)^{r→} have the same set of vertices (namely, V)
and the same set of arcs (namely, A); we therefore just need to show that their
psi-maps are the same. In other words, we need to show that ψ’ = ψ, where ψ’
is the psi-map of (D^und)^{r→}.

Let a ∈ A be arbitrary. Let ψ(a) = (s, t). Thus, the arc a of D has source s and
target t. Lemma 5.7.8 (b) therefore shows that in the multidigraph (D^und)^{r→},
the arc a has source s and target t as well. In other words, ψ’(a) = (s, t) (since
ψ’ is the psi-map of this multidigraph). Hence, ψ’(a) = (s, t) = ψ(a).

Forget that we fixed a. We thus have shown that ψ’(a) = ψ(a) for each
a ∈ A. In other words, ψ’ = ψ. As explained above, this completes the proof of
Statement C2.


Proof of the implication C2 ⟹ C1: Assume that Statement C2 holds. Thus, the
undirected multigraph D^und is a tree, and we have D = (D^und)^{r→}. Hence,
Lemma 5.7.7 (applied to T = D^und) yields that the multidigraph (D^und)^{r→} is
an arborescence rooted from r. In other words, D is an arborescence rooted
from r (since D = (D^und)^{r→}). This shows that Statement C1 holds.


Having now proved both implications C1 ⟹ C2 and C2 ⟹ C1, we conclude
that Statements C1 and C2 are equivalent. Thus, Theorem 5.7.6 is proved. □


Oof. 
Let’s get one more consequence out of this. First, let us show that an arbores- 
cence can have only one root: 


Proposition 5.7.9. Let D be an arborescence rooted from r. Then, r is the 
only root of D. 


Proof of Proposition 5.7.9. Assume the contrary. Thus, D has another root s
distinct from r. Hence, D has a path from r to s (since r is a root) as well as a path
from s to r (since s is a root). Combining these paths gives a circuit of length
> 0. However, a circuit of length > 0 in a digraph must always contain a cycle
(since Proposition 4.5.9 shows that it either is a path or contains a cycle; but it
clearly cannot be a path). Hence, D has a cycle. Therefore, D^und also has a cycle
(since any cycle of D is a cycle of D^und). However, D^und has no cycles (since
D is an arborescence rooted from r). The preceding two sentences contradict
each other. This shows that the assumption was wrong, and Proposition 5.7.9
is proven. □


Definition 5.7.10. A multidigraph D is said to be an arborescence if there 
exists a vertex r of D such that D is an arborescence rooted from r. In this 
case, this r is uniquely determined as the only root of D (by Proposition 
5.7.9). 


Theorem 5.7.11. There are two mutually inverse maps


{pairs (T, r) of a tree T and a vertex r of T} → {arborescences},
(T, r) ↦ T^{r→}


and


{arborescences} → {pairs (T, r) of a tree T and a vertex r of T},
D ↦ (D^und, √D),


where √D denotes the root of D.

Proof. The map


{pairs (T, r) of a tree T and a vertex r of T} → {arborescences},
(T, r) ↦ T^{r→}


is well-defined because of Lemma 5.7.7. The map


{arborescences} → {pairs (T, r) of a tree T and a vertex r of T},
D ↦ (D^und, √D)


is well-defined because if D is an arborescence, then D^und is a tree. In order to
show that these two maps are mutually inverse, we must check the following
two statements:


1. Each arborescence D satisfies (D^und)^{r→} = D, where r is the root of D;


2. Each pair (T, r) of a tree T and a vertex r of T satisfies (T^{r→})^und = T and
√(T^{r→}) = r.


However, Statement 1 follows from Theorem 5.7.6 (specifically, from the
implication C1 ⟹ C2 in Theorem 5.7.6). Statement 2 follows from Lemma 5.7.7
(more precisely, the (T^{r→})^und = T part of Statement 2 is obvious, whereas
the √(T^{r→}) = r part follows from Lemma 5.7.7). Thus, Theorem 5.7.11 is
proved. □


Theorem 5.7.11 formalizes the idea that an arborescence is “just a tree with a
chosen vertex”. For this reason, arborescences are sometimes called “oriented 
trees”, but this name is also shared with a more general notion, which is why I 
avoid it. 
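One direction of this correspondence can be watched in action: start with an arborescence D, forget the directions and remember only the root, then re-orient every edge away from the root and compare with D. The sketch below makes assumptions of ours: arcs and edges are stored as dictionaries keyed by name, vertices are integers (so that endpoint pairs can be sorted), and the root is located as the unique vertex of indegree 0, which is justified by Statement A6 of Theorem 5.6.5. All function names are illustrative.

```python
from collections import deque


def underlying_edges(arcs):
    """Forget directions: store each arc's endpoints in sorted order."""
    return {name: tuple(sorted(st)) for name, st in arcs.items()}


def root_of(vertices, arcs):
    """Root of an arborescence: the unique vertex that is no arc's target."""
    targets = {target for _, target in arcs.values()}
    (root,) = set(vertices) - targets  # fails unless exactly one candidate
    return root


def orient_away_from(vertices, edges, r):
    """Orient each edge of a tree from its r-parent to its r-child."""
    adj = {v: [] for v in vertices}
    for u, v in edges.values():
        adj[u].append(v)
        adj[v].append(u)
    dist = {r: 0}
    queue = deque([r])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return {
        name: (u, v) if dist[u] < dist[v] else (v, u)
        for name, (u, v) in edges.items()
    }


# Hypothetical arborescence rooted from 3.
V = {1, 2, 3, 4, 5}
D = {"a": (3, 1), "b": (3, 4), "c": (4, 2), "d": (4, 5)}

r = root_of(V, D)
T = underlying_edges(D)
print(r, T)
print(orient_away_from(V, T, r) == D)
```

The final comparison is an equality of dictionaries, not an isomorphism test, in the same spirit as the "honest equality" in Statement C2 of Theorem 5.7.6.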


Exercise 5.17. Let G = (V, E, φ) be a connected multigraph such that |E| ≥
|V|. Show that there exists an injective map f : V → E such that for each
vertex v ∈ V, the edge f(v) contains v.

(In other words, show that we can assign to each vertex an edge that contains
this vertex in such a way that no edge is assigned twice.)


5.8. Spanning arborescences 


In analogy to spanning subgraphs of a multigraph, we can define spanning
subdigraphs of a multidigraph:

Definition 5.8.1. A spanning subdigraph of a multidigraph D = (V, A, ψ)
means a multidigraph of the form (V, B, ψ|_B), where B is a subset of A.

In other words, it means a submultidigraph of D with the same vertex set 
as D. 

In other words, it means a multidigraph obtained from D by removing 
some arcs, but leaving all vertices untouched. 


Definition 5.8.2. Let D be a multidigraph. Let r be a vertex of D. A spanning 
arborescence of D rooted from r means a spanning subdigraph of D that is 
an arborescence rooted from r. 


Example 5.8.3. Let D = (V, A, y) be the following multidigraph: 


(V, {a, c, e}, ψ|_{{a,c,e}})


By abuse of notation, we shall refer to this spanning arborescence simply 
as {a,c,e} (since a spanning subdigraph of D is uniquely determined by its 
arc set). Another spanning arborescence of D rooted from 1 is {a,b,e}. Yet 
another is {a, b, f}. A non-example is {a, d, f} (indeed, this is an arborescence
rooted from 3, not from 1). 

Is there a spanning arborescence of D rooted from 2 ? Yes, for example 
{b,d, f}. 

Is there a spanning arborescence of D rooted from 4 ? No, since 4 is not a 
from-root of D. 


This illustrates a first obstruction to the existence of spanning arborescences:
Namely, a digraph D can have a spanning arborescence rooted from r only if r
is a from-root. This necessary criterion is also sufficient:

Theorem 5.8.4. Let D be a multidigraph. Let r be a from-root of D. Then, D
has a spanning arborescence rooted from r.


Proof. This is an analogue of the “every connected multigraph has a spanning
tree” theorem (Theorem 5.4.6) that we proved in 4 ways. At least the first proof
easily adapts to the directed case:

Remove arcs from D one by one, but in such a way that the “rootness of r” 
(that is, the property that r is a root of our multidigraph) is preserved. So we 
can only remove an arc if r remains a root afterwards. 

Clearly, this removing process will eventually come to an end, since D has 
only finitely many arcs. Let D’ be the multidigraph obtained at the end of this 
process. Then, r is still a root of D’, but we cannot remove any more arcs from 
D’ without breaking the rootness of r. That is, if we remove any arc from D’,
then the vertex r will no longer be a from-root of the resulting multidigraph.
This means that D’ satisfies Statement A5 from the arborescence equivalence
theorem (Theorem 5.6.5). Thus, D’ satisfies Statement A1 as well (since all six
statements A1, A2, ..., A6 are equivalent). In other words, D’ is an arborescence
rooted from r. Since D’ is a spanning subdigraph of D, we thus conclude that D
has a spanning arborescence rooted from r (namely, D’). This proves Theorem
5.8.4. □
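
The arc-removal argument in this proof is effectively an algorithm. Here is a minimal Python sketch of it. It assumes, purely for illustration, that a multidigraph is stored as a dict sending each arc name to its (source, target) pair; the function names and the small test digraph are hypothetical and not taken from the text.

```python
def is_from_root(vertices, arcs, r):
    """Check whether every vertex can be reached from r along the arcs."""
    seen, stack = {r}, [r]
    while stack:
        u = stack.pop()
        for s, t in arcs.values():
            if s == u and t not in seen:
                seen.add(t)
                stack.append(t)
    return seen == set(vertices)


def prune_to_arborescence(vertices, arcs, r):
    """Remove arcs one by one, keeping r a from-root throughout."""
    if not is_from_root(vertices, arcs, r):
        raise ValueError("r must be a from-root of the digraph")
    kept = dict(arcs)
    for name in list(arcs):
        trial = {k: v for k, v in kept.items() if k != name}
        if is_from_root(vertices, trial, r):
            kept = trial  # this arc is not needed for the rootness of r
    return kept


# Hypothetical example (not the digraph D from the examples above).
vertices = [1, 2, 3, 4]
arcs = {"p": (1, 2), "q": (2, 3), "s": (1, 3), "t": (3, 4), "u": (4, 1)}
tree = prune_to_arborescence(vertices, arcs, 1)
print(sorted(tree))
```

A single pass over the arcs suffices here: if removing an arc breaks the rootness of r at some stage, then it still does so after further arcs have been removed, since fewer arcs can only mean fewer paths. So the surviving arc set satisfies Statement A5.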


Question 5.8.5. Can the other three proofs of Theorem 5.4.6 be adapted to
Theorem 5.8.4 too?


Example 5.8.6. Let n be a positive integer. The n-cycle digraph $\overrightarrow{C}_n$
is defined to be the simple digraph with vertices 1,2,...,n and arcs
12, 23, 34, ..., (n−1)n, n1. (Here is how it looks for n = 5: [figure omitted].)

Note that this digraph $\overrightarrow{C}_n$ is a directed analogue of the cycle graph $C_n$. As
we recall from an earlier example, the cycle graph $C_n$ has n spanning trees.
In contrast, the digraph $\overrightarrow{C}_n$ has only one spanning arborescence rooted
from 1. This spanning arborescence is the subdigraph of $\overrightarrow{C}_n$ obtained by
removing the arc n1.

Proof. If we remove the arc n1 from $\overrightarrow{C}_n$, then we obtain the simple digraph E
with vertices 1,2,...,n and arcs 12, 23, ..., (n−1)n. This digraph E is easily
seen to be an arborescence rooted from 1 (indeed, 1 is a from-root of E, and the
underlying undirected graph $E^{\mathrm{und}} = P_n$ has no cycles). Thus, E is a spanning
arborescence of $\overrightarrow{C}_n$ rooted from 1.

We shall now prove that it is the only such arborescence. Indeed, let F be
any spanning arborescence of $\overrightarrow{C}_n$ rooted from 1. Then, 1 is a from-root of
F. Hence, for each vertex v ∈ {2,3,...,n}, the digraph F must have a path
from 1 to v, and thus must contain an arc with target v (namely, the last arc
of this path). This arc must be (v−1, v) (since this is the only arc of $\overrightarrow{C}_n$ with
target v). Thus, for each vertex v ∈ {2,3,...,n}, the digraph F must contain
the arc (v−1, v). In other words, the digraph F must contain all n−1 arcs
12, 23, ..., (n−1)n. If F were to also contain the remaining arc n1 of $\overrightarrow{C}_n$,
then the underlying undirected graph $F^{\mathrm{und}} = C_n$ would contain a cycle, which
would contradict F being an arborescence. Hence, F cannot contain the arc n1.
Thus, F contains the n−1 arcs 12, 23, ..., (n−1)n and no others. In other
words, F = E. This shows that any spanning arborescence of $\overrightarrow{C}_n$ rooted from
1 must be E. In other words, E is the only spanning arborescence of $\overrightarrow{C}_n$ rooted
from 1. This completes the proof of Example 5.8.6. □
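
The contrast between the two counts can be checked by brute force for small n. The following Python sketch is only an illustration under stated assumptions: arcs are stored as (source, target) pairs, all function names are made up here, and the test for "arborescence" relies on Statement A2 of the arborescence equivalence theorem (an arc set of size n−1 for which the root is a from-root).

```python
from itertools import combinations


def reaches_all(n, arcs, root):
    """Check whether all of 1, ..., n can be reached from root along arcs."""
    seen, stack = {root}, [root]
    while stack:
        u = stack.pop()
        for s, t in arcs:
            if s == u and t not in seen:
                seen.add(t)
                stack.append(t)
    return len(seen) == n


def cycle_digraph(n):
    """Arcs 12, 23, ..., (n-1)n, n1 of the n-cycle digraph."""
    return [(i, i % n + 1) for i in range(1, n + 1)]


def count_arborescences_from(n, arcs, root):
    """Count arc subsets of size n-1 for which root is a from-root."""
    return sum(reaches_all(n, sub, root) for sub in combinations(arcs, n - 1))


def count_spanning_trees(n, edges):
    """Count edge subsets of size n-1 that connect all n vertices."""
    total = 0
    for sub in combinations(edges, n - 1):
        both_ways = list(sub) + [(t, s) for s, t in sub]  # forget directions
        total += reaches_all(n, both_ways, 1)
    return total


n = 5
print(count_arborescences_from(n, cycle_digraph(n), 1))  # 1
print(count_spanning_trees(n, cycle_digraph(n)))         # 5
```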


5.9. The BEST theorem: statement 


We now come to something much more surprising. 

Recall that a multidigraph D = (V, A, ψ) is balanced if and only if each
vertex v satisfies $\deg^- v = \deg^+ v$. This is necessary for the existence of an
Eulerian circuit. If D is weakly connected, this is also sufficient (by Theorem
4.7.2 (a)).


Surprisingly, there is a formula for the number of these Eulerian circuits: 


Theorem 5.9.1 (The BEST theorem). Let D = (V, A, ψ) be a balanced
multidigraph such that each vertex has indegree > 0. Fix an arc a of D, and let
r be its target. Let τ(D, r) be the number of spanning arborescences of D
rooted from r. Let ε(D, a) be the number of Eulerian circuits of D whose last
arc is a. Then,

\[
\varepsilon(D, a) = \tau(D, r) \cdot \prod_{u \in V} (\deg^- u - 1)! .
\]


The “BEST” in the name of this theorem is an abbreviation for de Bruijn, van
Aardenne-Ehrenfest, Smith and Tutte, who discovered it in the middle of the
20th century.⁴⁵

45 More precisely, van Aardenne-Ehrenfest and de Bruijn discovered it in 1951 (see [VanEhr51,
§6]), generalizing an earlier result of Smith and Tutte.

46 We note that the number of Eulerian circuits of D whose last arc is a is precisely the number
of all Eulerian circuits of D counted up to rotation. Indeed, each Eulerian circuit of D
contains the arc a exactly once, and thus can be rotated in a unique way to end with a.

To prove this theorem, we shall restate it in terms of “arborescences to” (as
opposed to “arborescences from”). Mathematically speaking, this restatement
isn’t really necessary (the argument is the same in both cases up to reversing
the directions of all arcs), but it helps make the proof more intuitive, since it
lets us build our Eulerian circuits by moving forwards rather than backwards.


5.10. Arborescences rooted to r


Here is the formal definition of “arborescences to”:

Definition 5.10.1. Let D be a multidigraph. Let r be a vertex of D.

(a) We say that r is a to-root of D if for each vertex v of D, the digraph D
has a path from v to r.

(b) We say that D is an arborescence rooted to r if r is a to-root of D and
the undirected multigraph $D^{\mathrm{und}}$ has no cycles.

Clearly, Definition 5.10.1 differs from the corresponding definition of
from-roots and of arborescences rooted from r only in the direction of
the arcs. In other words, if we reverse each arc of our digraph (turning its
source into its target and vice versa), then a from-root becomes a to-root, and
an arborescence rooted from r becomes an arborescence rooted to r, and vice
versa. Thus, every property that we have proved for arborescences rooted from
r can be translated into the language of arborescences rooted to r by reversing
all arcs.

If you want to see this stated more rigorously, here is a formal definition of
“reversing each arc”:

Definition 5.10.2. Let D = (V, A, ψ) be a multidigraph. Then, $D^{\mathrm{rev}}$ shall denote the
multidigraph (V, A, τ ∘ ψ), where τ : V × V → V × V is the map that sends each
pair (s, t) to (t, s). Thus, if an arc a of D has source s and target t, then it is also an
arc of $D^{\mathrm{rev}}$, but in this digraph $D^{\mathrm{rev}}$ it has source t and target s.

The multidigraph $D^{\mathrm{rev}}$ is called the reversal of the multidigraph D; we say that it
is obtained from D by “reversing each arc”.

This notion of “reversing each arc” allows us to reverse walks in digraphs: If w is a
walk from a vertex s to t in some multidigraph D, then its reversal rev w (obtained by
reading w backwards) is a walk from t to s in the multidigraph $D^{\mathrm{rev}}$. The same holds
if we replace the word “walk” by “path”. Thus, we easily obtain the following:

Proposition 5.10.3. Let D be a multidigraph. Let r be a vertex of D. Then:

(a) The vertex r is a to-root of D if and only if r is a from-root of $D^{\mathrm{rev}}$.

(b) The digraph D is an arborescence rooted to r if and only if $D^{\mathrm{rev}}$ is an
arborescence rooted from r.


Proof. Completely straightforward unpacking of the definitions. □
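
Part (a) of the proposition can be tried out on a small example. The sketch below is hedged: the dict representation (arc name to (source, target) pair), the function names, and the three-vertex digraph are all assumptions made for illustration. The to-root test is implemented directly from the definition rather than via the reversal, so that the comparison is not circular.

```python
def reverse(arcs):
    """Swap source and target of every arc (this mirrors the map tau o psi)."""
    return {name: (t, s) for name, (s, t) in arcs.items()}


def reachable_from(arcs, start):
    """Set of all vertices that can be reached from start."""
    seen, stack = {start}, [start]
    while stack:
        u = stack.pop()
        for s, t in arcs.values():
            if s == u and t not in seen:
                seen.add(t)
                stack.append(t)
    return seen


def is_from_root(vertices, arcs, r):
    return reachable_from(arcs, r) == set(vertices)


def is_to_root(vertices, arcs, r):
    return all(r in reachable_from(arcs, v) for v in vertices)


# Hypothetical digraph with arcs 1 -> 2, 2 -> 3 and 1 -> 3.
vertices = [1, 2, 3]
arcs = {"x": (1, 2), "y": (2, 3), "z": (1, 3)}
to_roots = [r for r in vertices if is_to_root(vertices, arcs, r)]
from_roots_of_reversal = [r for r in vertices
                          if is_from_root(vertices, reverse(arcs), r)]
print(to_roots, from_roots_of_reversal)
```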


Note that when we reverse each arc in a digraph D, the outdegrees of its 
vertices become their indegrees and vice versa. Hence, a balanced digraph D 
remains balanced when this happens. In particular, the BEST theorem (Theorem
5.9.1) thus gets translated as follows:


Theorem 5.10.4 (The BEST’ theorem). Let D = (V, A, ψ) be a balanced
multidigraph such that each vertex has outdegree > 0. Fix an arc a of D, and let
r be its source. Let τ(D, r) be the number of spanning arborescences of D
rooted to r. Let ε(D, a) be the number of Eulerian circuits of D whose first
arc is a. Then,

\[
\varepsilon(D, a) = \tau(D, r) \cdot \prod_{u \in V} (\deg^+ u - 1)! .
\]


We will soon prove Theorem 5.10.4 and then derive Theorem 5.9.1 from it by
reversing the arcs.

First, however, let us state the analogue of the Arborescence Equivalence
Theorem (Theorem 5.6.5) for “arborescences rooted to r” (as opposed to
“arborescences rooted from r”):


Theorem 5.10.5 (The dual arborescence equivalence theorem). Let D =
(V, A, ψ) be a multidigraph with a to-root r. Then, the following six
statements are equivalent:

• Statement A′1: The multidigraph D is an arborescence rooted to r.

• Statement A′2: We have |A| = |V| − 1.

• Statement A′3: The multigraph $D^{\mathrm{und}}$ is a tree.

• Statement A′4: For each vertex v ∈ V, the multidigraph D has a unique
walk from v to r.

• Statement A′5: If we remove any arc from D, then the vertex r will no
longer be a to-root of the resulting multidigraph.

• Statement A′6: We have $\deg^+ r = 0$, and each v ∈ V \ {r} satisfies
$\deg^+ v = 1$.


Proof. Upon reversing all arcs of D, this turns into the original Arborescence
Equivalence Theorem (Theorem 5.6.5). □


5.11. The BEST theorem: proof


We now come to the proof of the BEST theorem (Theorem 5.9.1). As we said,
we proceed by proving Theorem 5.10.4 first. We first outline the idea of the
proof; then we will give the details. 


Proof idea for Theorem 5.10.4. An a-Eulerian circuit shall mean an Eulerian
circuit of D whose first arc is a.

Let e be an a-Eulerian circuit. Its first arc is a; therefore, its first and last
vertex is r.

Being an Eulerian circuit, e must contain each arc of D and therefore contain
each vertex of D (since each vertex has outdegree > 0). For each vertex u ≠ r,
we let e(u) be the last exit of e from u, that is, the last arc of e that has source
u. Let Exit e be the set of these last exits e(u) for all vertices u ≠ r. Then, we
claim:

Claim 1: This set Exit e (or, more precisely, the spanning subdigraph
$(V, \operatorname{Exit} e, \psi|_{\operatorname{Exit} e})$) is a spanning arborescence of D rooted to r.


Let’s assume for the moment that Claim 1 is proven. Thus, given any a- 
Eulerian circuit e, we have constructed a spanning arborescence of D rooted to 
r. 

How many a-Eulerian circuits e lead to a given arborescence in this way? 
The answer is rather nice: 


Claim 2: For each spanning arborescence $(V, B, \psi|_B)$ of D rooted to
r, there are exactly $\prod_{u \in V} (\deg^+ u - 1)!$ many a-Eulerian circuits e such
that Exit e = B.

Let us again assume that this is proven. Combining Claim 1 with Claim
2, we obtain a $\prod_{u \in V} (\deg^+ u - 1)!$-to-1 correspondence between the a-Eulerian
circuits and the spanning arborescences of D rooted to r. Thus, the number
of the former is $\prod_{u \in V} (\deg^+ u - 1)!$ times the number of the latter. But this is
precisely the claim of Theorem 5.10.4. Hence, in order to prove Theorem 5.10.4,
it remains to prove Claim 1 and Claim 2. □


Here is the complete proof: 


Proof of Theorem 5.10.4. Some notations first:

An outgoing arc from a vertex u will mean an arc whose source is u. An 
incoming arc into a vertex u will mean an arc whose target is u. 

An a-Eulerian circuit shall mean an Eulerian circuit of D whose first arc is a. 

A sparb shall mean a spanning arborescence of D rooted to r. 

A spanning subdigraph of D always has the form $(V, B, \psi|_B)$ for some subset
B of A. Thus, it is uniquely determined by its arc set B.
Hence, from now on, we shall identify a spanning subdigraph $(V, B, \psi|_B)$
of D with its arc set B. Conversely, any subset B of A will be identified with
the corresponding spanning subdigraph $(V, B, \psi|_B)$ of D. Thus, for instance,
when we say that a subset B of A “is a sparb”, we shall actually mean that the
corresponding spanning subdigraph $(V, B, \psi|_B)$ is a sparb.

For each a-Eulerian circuit e, we define a subset Exit e of A as follows:

Let e be an a-Eulerian circuit. Its first arc is a; thus, its first and last vertex
is r. Being an Eulerian circuit, e must contain each arc of D and therefore also
contain each vertex of D (since each vertex of D has outdegree > 0). For each
vertex u ∈ V \ {r}, we let e(u) be the last exit of e from u; this means the last
arc of e that has source u. We let Exit e be the set of these last exits e(u) for
all u ∈ V \ {r}. Thus, we have defined a subset Exit e of A for each a-Eulerian
circuit e.


Example 5.11.1. Here is an example of this construction: Let D be the
multidigraph

[Figure: the multidigraph D.]

with r = 1, and let e be the a-Eulerian circuit

(1, a, 2, b, 3, c, 4, d, 5, e, 1, f, 3, g, 3, h, 5, i, 5, j, 2, k, 4, l, 1)

(we have deliberately named the arcs in such a way that they appear on an
Eulerian circuit in alphabetic order). Then,

e(2) = k, e(3) = h, e(4) = l, e(5) = j,

so that Exit e = {k, h, l, j}. Here is Exit e as a spanning subdigraph:

[Figure: the spanning subdigraph Exit e of D.]


Now, we claim the following: 
Claim 1: Let e be an a-Eulerian circuit. Then, the set Exit e is a sparb. 


Claim 2: For each sparb B (regarded as a subset of A), there are
exactly $\prod_{u \in V} (\deg^+ u - 1)!$ many a-Eulerian circuits e such that Exit e = B.


[Proof of Claim 1: The set Exit e contains exactly one outgoing arc (namely,
e(u)) from each vertex u ∈ V \ {r}, and no outgoing arc from r. Thus, |Exit e| =
|V| − 1.

Let us number the arcs of e as $a_1, a_2, \ldots, a_m$, in the order in which they appear
in e. (Thus, $a_1 = a$, since the first arc of e is a.)

Recall that the arcs in Exit e are the arcs e(u) for all u ∈ V \ {r} (defined as
above, i.e., the arc e(u) is the last exit of e from u). We shall refer to these arcs
as the last-exit arcs.

For each u ∈ V \ {r}, we let j(u) be the unique number i ∈ {1,2,...,m} such
that $e(u) = a_i$. (This i indeed exists and is unique, since each arc of D appears
exactly once on e.) Thus, j(u) tells us how late in the Eulerian circuit e the arc
e(u) appears. Since e(u) is the last exit of e from u, the Eulerian circuit e never
visits the vertex u again after this.

Thus, if a last-exit arc e(u) has target v ≠ r, then

j(u) < j(v)          (21)


(because the arc e(u) leads the circuit e into the vertex v, which the circuit then
has to exit at least once; therefore, the corresponding last-exit arc e(v) has to
appear later in e than the arc e(u)).

We shall now show that r is a to-root of Exit e (that is, of the spanning
subdigraph $(V, \operatorname{Exit} e, \psi|_{\operatorname{Exit} e})$). To this purpose, we must show that for each vertex
v ∈ V, there is a path from v to r in the digraph $(V, \operatorname{Exit} e, \psi|_{\operatorname{Exit} e})$.

Indeed, let v ∈ V be any vertex. We must find a path from v to r in the
digraph $(V, \operatorname{Exit} e, \psi|_{\operatorname{Exit} e})$. It will suffice to find a walk from v to r in this
digraph (by Corollary 4.5.8). In other words, we must find a way to walk from
v to r in D using last-exit arcs only.

So we start walking at v. If v = r, then we are already done. Otherwise, we
have v ∈ V \ {r}, so that the arc e(v) and the number j(v) are well-defined.
We thus take the arc e(v). This brings us to a vertex v′ (namely, the target of
e(v)) that satisfies j(v) < j(v′) (by (21)). If this vertex v′ is r, then we are done.
If not, then e(v′) and j(v′) are well-defined, so we continue our walk by taking
the arc e(v′). This brings us to a further vertex v″ (namely, the target of e(v′))
that satisfies j(v′) < j(v″) (by (21)). If this vertex v″ is r, then we are done.
Otherwise, we proceed as before. We thus construct a walk

(v, e(v), v′, e(v′), v″, e(v″), ...)

that either goes on indefinitely or stops at the vertex r.
However, this walking process cannot go on forever (since the chain of
inequalities j(v) < j(v′) < j(v″) < ⋯ would force the numbers j(v), j(v′), j(v″), ...
to be all distinct, but there are only m distinct numbers in {1,2,...,m}). Thus,
it must stop at the vertex r. So we have found a walk from v to r using last-exit
arcs only. Thus, Exit e has a walk from v to r. Hence, Exit e has a path from v
to r.

Forget that we fixed v. We thus have shown that for each vertex v ∈ V,
there is a path from v to r in the digraph $(V, \operatorname{Exit} e, \psi|_{\operatorname{Exit} e})$. In other words,
r is a to-root of Exit e. Hence, we conclude (using the implication A′2 ⟹ A′1
in Theorem 5.10.5) that Exit e is an arborescence rooted to r (since |Exit e| =
|V| − 1). Therefore, Exit e is a sparb. This proves Claim 1.]
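
To see the last-exit construction in action, here is a small Python sketch that recomputes Exit e for the circuit of Example 5.11.1 and then walks from a vertex to r along last-exit arcs. The arc list is an assumption: it is read off from the circuit itself, since the picture of D is not reproduced here. The function names are likewise made up for this illustration.

```python
# Arcs of D, read off from the circuit in Example 5.11.1 (an assumption):
# arc name -> (source, target).
D = {"a": (1, 2), "b": (2, 3), "c": (3, 4), "d": (4, 5), "e": (5, 1),
     "f": (1, 3), "g": (3, 3), "h": (3, 5), "i": (5, 5), "j": (5, 2),
     "k": (2, 4), "l": (4, 1)}
circuit = list("abcdefghijkl")  # the arcs of the Eulerian circuit, in order
r = 1


def last_exits(arcs, circuit, r):
    """Map each vertex u != r to the last arc of the circuit with source u."""
    exits = {}
    for name in circuit:
        source = arcs[name][0]
        if source != r:
            exits[source] = name  # later arcs overwrite earlier ones
    return exits


def walk_to_root(arcs, exits, v, r):
    """Follow last-exit arcs from v; return the list of visited vertices."""
    walk = [v]
    for _ in range(len(exits)):  # a path visits each non-root vertex at most once
        if v == r:
            break
        v = arcs[exits[v]][1]
        walk.append(v)
    if v != r:
        raise ValueError("the last-exit arcs do not lead to r")
    return walk


exits = last_exits(D, circuit, r)
print(exits)
print(walk_to_root(D, exits, 3, r))
```

Along the walk from 3, the arcs used are h, j, k, l, whose positions in the circuit (8, 10, 11, 12) increase, as inequality (21) predicts.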


[Proof of Claim 2: Let B be a sparb. (As before, B is a set of arcs, and we
identify it with the spanning subdigraph $(V, B, \psi|_B)$.)

We must prove that there are exactly $\prod_{u \in V} (\deg^+ u - 1)!$ many a-Eulerian
circuits e such that Exit e = B.

We shall refer to the arcs in B as the B-arcs. Recall that B is an arborescence
rooted to r (since B is a sparb). Hence, by the implication A′1 ⟹ A′6 in Theorem
5.10.5, we see that the outdegrees of its vertices satisfy

\[
\deg^+_B r = 0, \qquad \text{and} \qquad \deg^+_B v = 1 \text{ for all } v \in V \setminus \{r\}
\]

(where $\deg^+_B v$ means the outdegree of a vertex v in the digraph $(V, B, \psi|_B)$).
In other words, there is no B-arc with source r; however, for each vertex u ∈
V \ {r}, there is exactly one B-arc with source u.

Now, we are trying to count the a-Eulerian circuits e such that Exit e = B.
Let us try to construct such an a-Eulerian circuit e as follows:

A turtle wants to walk through the digraph D using each arc of D at most 
once. It starts its walk by heading out from the vertex r along the arc a. From 
that point on, it proceeds in the usual way you would walk on a digraph: Each 
time it reaches a vertex, it chooses an arbitrary arc leading out of this vertex, 
observing the following two rules: 


1. It never uses an arc that it has already used before. 


2. It never uses a B-arc unless it has to (i.e., unless this B-arc is the only 
outgoing arc from its current position that is still unused). 


Clearly, the turtle will eventually get stuck at some vertex (with no more arcs 
left to continue walking along), since D has only finitely many arcs. 

Let w be the total walk that the turtle has traced by the time it got stuck. 
Thus, w is a trail (i.e., a walk that uses no arc more than once) that starts with 
the vertex r and the arc a. 

We will soon see that w is an a-Eulerian circuit satisfying Exit w = B. First,
however, let us see an example: 


Example 5.11.2. Let D be the multidigraph

[Figure: the multidigraph D, with the arcs of B drawn bold and in red.]

and let r = 1 and a = a (we called it a on purpose). Let B be the set {d,e,h,k},
regarded as a spanning subdigraph of D. (The arcs of B are drawn bold and 
in red in the above picture.) 

The turtle starts at r = 1 and walks along the arc a. This leads it to the 
vertex 2. It now must choose between the arcs b and k, but since it must not 
use the B-arc k unless it has to, it is actually forced to take the arc b next. 
This brings it to the vertex 3. It now has to choose between the arcs c, g and 
h, but again the arc h is disallowed because it is not yet time to use a B-arc. 
Let us say that it takes the arc g. This brings it back to the vertex 3. Next, the 
turtle must walk along c (since g is already used, while the B-arc still must
wait until it is the only option). This brings it to the vertex 4. Its next step is
to take the arc l to the vertex 1. From there, it follows the arc f to the vertex
3. Now, it can finally take the B-arc h, since all the other outgoing arcs from
3 have already been used. This brings it to the vertex 5. Now it has a choice
between the arcs e, i and j, but the arc e is disallowed because it is a B-arc.
Let us say it decides to use the arc j. This brings it to the vertex 2. From
there, it takes the B-arc k to the vertex 4 (since it has no other options). From
there, it continues along the B-arc d to the vertex 5. Now, it has to traverse
the loop i, and then leave 5 along the B-arc e to come back to 1. At this point,
the turtle is stuck, since it has nowhere left to go. The walk w we obtained is
thus

w = (1, a, 2, b, 3, g, 3, c, 4, l, 1, f, 3, h, 5, j, 2, k, 4, d, 5, i, 5, e, 1).


(Of course, other choices would have led to other walks.) 
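
The turtle's two rules are easy to simulate. The following Python sketch is an illustration only: the arc list of D is an assumption reconstructed from the walk described in this example (the picture is not reproduced here), and the `preference` argument is a hypothetical device for resolving the turtle's free choices (among the allowed arcs, it takes the one that comes first in `preference`).

```python
# Arcs of D, reconstructed from the walk in Example 5.11.2 (an assumption):
# arc name -> (source, target).
D = {"a": (1, 2), "b": (2, 3), "c": (3, 4), "d": (4, 5), "e": (5, 1),
     "f": (1, 3), "g": (3, 3), "h": (3, 5), "i": (5, 5), "j": (5, 2),
     "k": (2, 4), "l": (4, 1)}
B = {"d", "e", "h", "k"}


def turtle_walk(arcs, B, a, preference):
    """Return the list of arcs the turtle uses, starting with the arc a."""
    rank = {name: i for i, name in enumerate(preference)}
    used = [a]
    here = arcs[a][1]
    while True:
        # Rule 1: only arcs that have not been used yet are allowed.
        unused = [n for n, (s, _) in arcs.items() if s == here and n not in used]
        if not unused:
            return used  # the turtle is stuck
        # Rule 2: a B-arc is allowed only if nothing else is left.
        non_b = [n for n in unused if n not in B]
        nxt = min(non_b or unused, key=rank.get)
        used.append(nxt)
        here = arcs[nxt][1]


# Reproduce the choices made in the example (g before c, and j before i).
print("".join(turtle_walk(D, B, "a", "abgclfhjkdie")))
# Another set of choices: always take the alphabetically first allowed arc.
print("".join(turtle_walk(D, B, "a", "abcdefghijkl")))
```

Both runs use all 12 arcs of D, and in both of them the last arc leaving each of the vertices 2, 3, 4, 5 is the B-arc at that vertex.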


Returning to the general case, let us analyze the walk w traversed by the 
turtle. 


• First, we claim that w is a closed walk (i.e., ends at r).

[Proof: Assume the contrary. Let u be the ending point of w. Thus, u
is the vertex at which the turtle gets stuck. Moreover, u ≠ r (since we
just assumed that w is not a closed walk). Hence, the walk w enters the
vertex u more often than it leaves it (since it ends but does not start at u).
In other words, the turtle has entered the vertex u more often than it has
left it. However, since D is balanced, we have $\deg^- u = \deg^+ u$. The turtle
has entered the vertex u at most $\deg^- u$ times (because it cannot use an
arc twice, but there are only $\deg^- u$ many arcs with target u). Thus, it has
left the vertex u less than $\deg^- u$ times (because it has entered the vertex
u more often than it has left it). Since $\deg^- u = \deg^+ u$, this means that
the turtle has left the vertex u less than $\deg^+ u$ times. Thus, by the time
the turtle has gotten stuck at u, there is at least one outgoing arc from u
that has not been used by the turtle. Therefore, the turtle is not actually
stuck at u. This is a contradiction. Thus, our assumption was wrong, so
we have proved that w is a closed walk.]


In other words, w is a circuit. We shall next show that w is an Eulerian 
circuit. 

To do so, we introduce one more piece of notation: A vertex u of D will be 
called exhausted if the turtle has used each outgoing arc from u (that is, if each 
outgoing arc from u is used in the circuit w). 

Since w is a circuit, the ending point of w is its starting point, i.e., the vertex 
r. Thus, the turtle must have gotten stuck at r. Hence, the vertex r is exhausted. 


• We shall now show that all vertices of D are exhausted.

[Proof: Assume the contrary. Thus, there exists a vertex u of D that is
not exhausted. Consider this u. But B is a sparb, thus an arborescence
rooted to r. Hence, r is a to-root of B. Therefore, there exists a path
$p = (p_0, b_1, p_1, b_2, p_2, \ldots, b_k, p_k)$ from u to r in B. Consider this path. Thus,
we have $p_0 = u$ and $p_k = r$, and all the arcs $b_1, b_2, \ldots, b_k$ belong to B.

There exists at least one i ∈ {0,1,...,k} such that the vertex $p_i$ is
exhausted (for instance, i = k qualifies, since $p_k = r$ is exhausted). Consider
the smallest such i. Then, $p_i \neq p_0$ (since $p_i$ is exhausted, but $p_0 = u$ is
not). Hence, i ≠ 0, so that i ≥ 1. Therefore, $p_{i-1}$ exists. Moreover, the
vertex $p_{i-1}$ is not exhausted (since i was defined to be the smallest element
of {0,1,...,k} such that $p_i$ is exhausted).

The arc $b_i$ has source $p_{i-1}$ and target $p_i$. Thus, it is an outgoing arc from
$p_{i-1}$ and an incoming arc into $p_i$. Furthermore, it belongs to B (since all the
arcs $b_1, b_2, \ldots, b_k$ belong to B).

The digraph D is balanced; thus, $\deg^+(p_i) = \deg^-(p_i)$.

The vertex $p_i$ is exhausted. In other words, the turtle has used each
outgoing arc from $p_i$ (by the definition of “exhausted”). Since the turtle never
reuses an arc, this entails that the turtle has used exactly $\deg^+(p_i)$ many
outgoing arcs from $p_i$ (since $\deg^+(p_i)$ is the total number of outgoing
arcs from $p_i$ in D). In other words, it has used exactly $\deg^-(p_i)$ many
outgoing arcs from $p_i$ (since $\deg^+(p_i) = \deg^-(p_i)$).

However, the turtle’s trajectory is a closed walk (in fact, it is the walk w,
which is closed). Thus, it must enter the vertex $p_i$ as often as it leaves this
vertex. In other words, the number of incoming arcs into $p_i$ used by the
turtle must equal the number of outgoing arcs from $p_i$ used by the turtle.
Since we just found (in the preceding paragraph) that the latter number
is $\deg^-(p_i)$, we thus conclude that the former number is $\deg^-(p_i)$ as
well. In other words, the turtle must have used exactly $\deg^-(p_i)$ many
incoming arcs into $p_i$. Since $\deg^-(p_i)$ is the total number of incoming arcs
into $p_i$ in D, we thus conclude that the turtle must have used all incoming
arcs into $p_i$ (since the turtle never reuses an arc).

Hence, in particular, the turtle must have used the arc $b_i$ (since $b_i$ is an
incoming arc into $p_i$). This arc $b_i$ is an outgoing arc from $p_{i-1}$. But $b_i$ is
a B-arc, and thus our turtle uses this arc only as a last resort (i.e., after
using all other outgoing arcs from $p_{i-1}$). Hence, we conclude that the
turtle must have used all outgoing arcs from $p_{i-1}$ (since it has used $b_i$).
In other words, $p_{i-1}$ is exhausted. But this contradicts the fact that $p_{i-1}$
is not exhausted! This shows that our assumption was wrong, and our
proof is finished.⁴⁷]


47 For the sake of diversity, let me sketch a second proof of the same claim (i.e., that all vertices
in D are exhausted):
Assume the contrary. Thus, there exists a non-exhausted vertex u of D. Consider this u.
Then, u ≠ r (since r is exhausted but u is not). Since u is not exhausted, there is at least
one outgoing arc from u that the turtle has not used. Hence, the turtle has not used the
B-arc outgoing from u (since the turtle never uses a B-arc before it has to). Let f be this
B-arc, and let $u'$ be its target. Thus, the turtle has not used all incoming arcs of $u'$ (because
it has not used the arc f). As a consequence, it has not used all outgoing arcs from $u'$ either
(because the turtle has left $u'$ as often as it has entered $u'$, but the balancedness of D entails
that $\deg^-(u') = \deg^+(u')$). In other words, the vertex $u'$ is non-exhausted.

Thus, by starting at the non-exhausted vertex u and taking the B-arc outgoing from u,
we have arrived at a further non-exhausted vertex $u'$. Applying the same argument to $u'$
instead of u, we can take a further B-arc and arrive at a further non-exhausted vertex $u''$.
Continuing like this, we obtain an infinite sequence $(u, u', u'', \ldots)$ of non-exhausted vertices
such that any vertex in this sequence is reached from the previous one by traveling along a
B-arc. Clearly, this sequence must have two equal vertices (since D has only finitely many
vertices). For example, let’s say that $u'' = u'''''$. Then, if we consider only the part of the
sequence between $u''$ and $u'''''$, then we obtain a closed walk

$(u'', *, u''', *, u'''', *, u''''')$,

where each asterisk stands for some B-arc (not the same one, of course). This is a closed
walk of the digraph $(V, B, \psi|_B)$. Since this closed walk has length > 0, it cannot be a path;
therefore, it contains a cycle (by Proposition 4.5.9). Thus, we have found a cycle of the
digraph $(V, B, \psi|_B)$. However, the digraph $(V, B, \psi|_B)$ is an arborescence, and thus has no
cycles (because if D is an arborescence, then any cycle of D would be a cycle of $D^{\mathrm{und}}$; but
the multigraph $D^{\mathrm{und}}$ has no cycles by the definition of an arborescence). The previous two
sentences contradict each other. This shows that our assumption was wrong, and our proof
is finished.

Thus, we have shown that all vertices of D are exhausted. In other words,
the turtle has used all arcs of D. In other words, the trail w contains all arcs of
D. Since w is a trail and a closed walk, this entails that w is an Eulerian circuit
of D. Since w starts with r and a, this shows further that w is an a-Eulerian
circuit. Since the turtle only used B-arcs as a last resort (and it used each B-arc
eventually, because w is Eulerian), we have Exit w = B.

Thus, the turtle’s walk has produced an a-Eulerian circuit e satisfying Exit e =
B (namely, the walk w). However, this circuit depends on some decisions the
turtle made during its walk. Namely, every time the turtle was at some vertex
u ∈ V, it had to decide which arc to take next; this arc had to be an unused arc
with source u, subject to the conditions that

1. if u ≠ r, then the B-arc⁴⁸ has to be used last;
2. if u = r, then the arc a has to be used first.

48 We say “the B-arc”, because there is exactly one B-arc with source u.

Let us count how many options the turtle has had in total. To make the
argument clearer, we modify the procedure somewhat: Instead of deciding
ad hoc which arc to take, the turtle should now make all these decisions before
embarking on its journey. To do so, it chooses, for each vertex u ∈ V, a total
order on the set of all arcs with source u, such that

1. if u ≠ r, then the B-arc comes last in this order, and

2. if u = r, then the arc a comes first in this order.


Note that this total order can be chosen in $(\deg^+ u - 1)!$ many ways (since
there are $\deg^+ u$ arcs with source u, and we can freely choose their order except
that one of them has a fixed position). Thus, in total, there are $\prod_{u \in V} (\deg^+ u - 1)!$
many options for how the turtle can choose all these orders. Once these orders
have been chosen, the turtle then uses them to decide which arcs to walk along:
Namely, the first time it visits the vertex u, it leaves it along the first arc
(according to its chosen order); the second time, it uses the second arc; the third
time, the third arc; and so on.

So the turtle has $\prod_{u \in V} (\deg^+ u - 1)!$ many options, and each of these options
leads to a different a-Eulerian circuit e (because the total orders chosen by the
turtle are reflected in e: they are precisely the orders in which the respective
arcs appear in e). Moreover, each a-Eulerian circuit e satisfying Exit e = B
comes from one of these options.⁴⁹
Therefore, the total number of a-Eulerian circuits e satisfying Exit e = B is the
total number of options, which is $\prod_{u \in V} (\deg^+ u - 1)!$ as we know. This proves
Claim 2.]
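
The count in Claim 2 can be confirmed on the digraph of Example 5.11.2 by listing all admissible families of total orders and letting the turtle follow each of them. As before, this Python sketch is only an illustration: the arc list of D is an assumption reconstructed from the walk in that example, and all function names are hypothetical.

```python
from itertools import permutations, product
from math import factorial, prod

# Arcs of D, reconstructed from Example 5.11.2 (an assumption):
# arc name -> (source, target).
D = {"a": (1, 2), "b": (2, 3), "c": (3, 4), "d": (4, 5), "e": (5, 1),
     "f": (1, 3), "g": (3, 3), "h": (3, 5), "i": (5, 5), "j": (5, 2),
     "k": (2, 4), "l": (4, 1)}
B = {"d", "e", "h", "k"}
r, a = 1, "a"


def out_arcs(arcs, u):
    return [n for n, (s, _) in arcs.items() if s == u]


def admissible_orders(arcs, B, r, a):
    """All ways to order the outgoing arcs at each vertex, with the arc a
    first at r and with the B-arc last at every other vertex."""
    vertices = sorted({s for s, _ in arcs.values()})
    per_vertex = []
    for u in vertices:
        out = out_arcs(arcs, u)
        fixed = a if u == r else next(n for n in out if n in B)
        rest = [n for n in out if n != fixed]
        if u == r:
            per_vertex.append([(fixed,) + p for p in permutations(rest)])
        else:
            per_vertex.append([p + (fixed,) for p in permutations(rest)])
    return [dict(zip(vertices, choice)) for choice in product(*per_vertex)]


def follow(arcs, orders, r):
    """Let the turtle walk from r, always taking the next arc in its order."""
    queues = {u: list(order) for u, order in orders.items()}
    here, walk = r, []
    while queues[here]:
        name = queues[here].pop(0)
        walk.append(name)
        here = arcs[name][1]
    return tuple(walk)


circuits = {follow(D, orders, r) for orders in admissible_orders(D, B, r, a)}
expected = prod(factorial(len(out_arcs(D, u)) - 1) for u in range(1, 6))
print(len(circuits), expected)
```

Here the outdegrees are 2, 2, 3, 2, 3, so the product is 1! · 1! · 2! · 1! · 2! = 4, and the four families of orders indeed give four distinct circuits, each using all 12 arcs.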
With Claims 1 and 2 proved, we are almost done. The map

{a-Eulerian circuits of D} → {sparbs},
e ↦ Exit e

is well-defined (by Claim 1). Furthermore, Claim 2 shows that this map is a
$\prod_{u \in V} (\deg^+ u - 1)!$-to-1 correspondence⁵⁰ (i.e., each sparb B has exactly
$\prod_{u \in V} (\deg^+ u - 1)!$ many preimages under this map). Thus, by the multijection
principle⁵¹, we conclude that⁵²

\[
(\text{\# of } a\text{-Eulerian circuits of } D) = \left( \prod_{u \in V} (\deg^+ u - 1)! \right) \cdot (\text{\# of sparbs}) .
\]

Since ε(D, a) = (# of a-Eulerian circuits of D) and τ(D, r) = (# of sparbs), we
can rewrite this as follows:

\[
\varepsilon(D, a) = \left( \prod_{u \in V} (\deg^+ u - 1)! \right) \cdot \tau(D, r) = \tau(D, r) \cdot \prod_{u \in V} (\deg^+ u - 1)! .
\]

This proves Theorem 5.10.4. □

49 Proof. Let e be an a-Eulerian circuit satisfying Exit e = B. Then, by choosing the appropriate
total orders ahead of its journey, the turtle will trace this exact circuit e. (Of course, the
“appropriate total orders” are the ones dictated by e: That is, for each vertex u ∈ V, the
turtle must pick the same total order on the set of all arcs with source u in which they appear
on e. This choice is legitimate, because the arc a is the first arc of e (so it will certainly come
first in its order), and because each B-arc appears in e after all other arcs from the same
source have appeared (so it will come last in its total order).)

50 An m-to-1 correspondence (where m is a nonnegative integer) means a map f : X → Y
between two sets such that each element of Y has exactly m preimages under f.

51 The multijection principle is a basic counting principle that says the following: Let X and Y
be two finite sets, and let m ∈ ℕ. Let f : X → Y be an m-to-1 correspondence (i.e., a map
such that each element of Y has exactly m preimages under f). Then, |X| = m · |Y|.

For example, n (intact) sheep have 4n legs in total, since the map that sends each leg to
its sheep is a 4-to-1 correspondence.

52 The symbol “#” means “number”.


Proof of Theorem 5.9.1. As we already mentioned, Theorem 5.9.1 follows from
Theorem 5.10.4 by reversing each arc (i.e., by applying Theorem 5.10.4 to the
digraph $D^{\mathrm{rev}}$ instead of D). □
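
As a sanity check, the BEST’ theorem (Theorem 5.10.4) can be tested by brute force on a small balanced multidigraph. The Python sketch below does this for the digraph used in Examples 5.11.1 and 5.11.2; the arc list is an assumption reconstructed from the walks in those examples, and the function names are invented for this illustration. Sparbs are counted via Statement A′6: pick one outgoing arc at each vertex other than r, and keep the choice if every vertex then reaches r.

```python
from itertools import product
from math import factorial, prod

# Arcs of D, reconstructed from Examples 5.11.1 and 5.11.2 (an assumption):
# arc name -> (source, target).
D = {"a": (1, 2), "b": (2, 3), "c": (3, 4), "d": (4, 5), "e": (5, 1),
     "f": (1, 3), "g": (3, 3), "h": (3, 5), "i": (5, 5), "j": (5, 2),
     "k": (2, 4), "l": (4, 1)}


def count_circuits_starting_with(arcs, a):
    """Count Eulerian circuits whose first arc is a, by exhaustive search."""
    start = arcs[a][0]

    def extend(here, used):
        if len(used) == len(arcs):
            return int(here == start)
        return sum(extend(t, used | {n})
                   for n, (s, t) in arcs.items()
                   if s == here and n not in used)

    return extend(arcs[a][1], frozenset([a]))


def count_sparbs(arcs, r):
    """Count spanning arborescences rooted to r."""
    vertices = sorted({v for pair in arcs.values() for v in pair})
    others = [v for v in vertices if v != r]
    targets = [[t for s, t in arcs.values() if s == u] for u in others]
    count = 0
    for choice in product(*targets):  # one outgoing arc per vertex u != r
        step = dict(zip(others, choice))

        def reaches_root(v):
            for _ in range(len(others)):
                if v == r:
                    return True
                v = step[v]
            return v == r

        count += all(reaches_root(v) for v in others)
    return count


eps = count_circuits_starting_with(D, "a")
tau = count_sparbs(D, 1)
outdeg = {u: sum(1 for s, _ in D.values() if s == u) for u in range(1, 6)}
rhs = tau * prod(factorial(d - 1) for d in outdeg.values())
print(eps, tau, rhs)
```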


5.12. A corollary about spanning arborescences 


Before we actually use the BEST (or BEST’) theorem to count the Eulerian cir- 
cuits on any digraph, let us mention a neat corollary for the number of spanning 
arborescences: 


Corollary 5.12.1. Let D = (V, A, ψ) be a balanced multidigraph. For each
vertex r ∈ V, let τ(D, r) be the number of spanning arborescences of D
rooted to r. Then, τ(D, r) does not depend on r.


Proof of Corollary [5.12.1] WLOG assume that |V| > 1 (else, the claim is obvious). 
If there is a vertex v € V with deg? v = 0, then this vertex v satisfies deg” v = 0 
as well (since the balancedness of D entails deg” v = deg* v = 0), and therefore 
D has no spanning arborescences at all (since any spanning arborescence would 
have an arc with source or target v). Thus, we WLOG assume that deg* v > 0 
for all v € V. In other words, each vertex has outdegree > 0. 

Let r and s be two vertices of D. We must prove that t (D,r) = t (D,s). 

Pick an arc a with source r. (This exists, since degt r > 0.) Pick an arc b with 
source s. (This exists, since deg* s > 0.) 

Applying the BEST’ theorem (Theorem [5.10.4), we get 


e(D,a) = t(D,r)- [| (deg* u—1)! and similarly 
ucV 

e(D,b) = t(D,s)- [| (deg* u—1)!. 
ucV 


However, e (D,a) = e (D, b), since counting Eulerian circuits that start with a is 
equivalent to counting Eulerian circuits that start with b (because an Eulerian 
circuit can be rotated uniquely to start with any given arc). Thus, we obtain 


\[
t(D,r) \cdot \prod_{u \in V} (\deg^+ u - 1)! = e(D,a) = e(D,b) = t(D,s) \cdot \prod_{u \in V} (\deg^+ u - 1)!.
\]

Cancelling the (nonzero!) number \prod_{u \in V} (\deg^+ u - 1)! from this equality, we
obtain t (D,r) = t (D,s). This proves Corollary 5.12.1. □
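Corollary 5.12.1 can be sanity-checked by brute force on a small example. The following is only an illustrative sketch, not part of the proof: the function name `count_sparbs`, the encoding of a multidigraph as a list of (source, target) pairs on the vertices 1, 2, ..., n, and the sample digraph are all assumptions made here. The sketch uses the fact that a spanning arborescence rooted to r amounts to choosing one outgoing arc for each vertex other than r in such a way that every vertex reaches r.

```python
from itertools import product


def count_sparbs(n, arcs, r):
    """Count spanning arborescences rooted to r (vertices are 1..n)."""
    others = [v for v in range(1, n + 1) if v != r]
    # Targets of the arcs leaving each non-root vertex (with multiplicity).
    out_targets = {v: [t for s, t in arcs if s == v] for v in others}
    count = 0
    for choice in product(*(out_targets[v] for v in others)):
        nxt = dict(zip(others, choice))
        ok = True
        for v in others:
            # Follow the chosen arcs; a vertex that reaches r does so in < n steps.
            for _ in range(n):
                if v == r:
                    break
                v = nxt[v]
            ok = ok and v == r
        count += ok
    return count


# Hypothetical balanced multidigraph on {1, 2, 3} (each vertex has indegree = outdegree).
arcs = [(1, 2), (2, 3), (3, 1), (1, 3), (3, 2), (2, 1), (1, 2), (2, 1)]
print([count_sparbs(3, arcs, r) for r in (1, 2, 3)])  # [5, 5, 5]
```

The count is the same for all three roots, as the corollary predicts for balanced multidigraphs.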
5.13. Spanning arborescences vs. spanning trees


The BEST theorem (Theorem 5.10.4 or Theorem 5.9.1) connects the # of Eulerian
circuits in a digraph with the # of spanning arborescences of the same digraph.
Now let us try to find a way to compute the latter.

For example, let us try to do this for digraphs of the form G^bidir, where G is a
multigraph. I claim that the spanning arborescences of G^bidir rooted to a given
vertex r are just the spanning trees of G in disguise:


Proposition 5.13.1. Let G = (V, E, φ) be a multigraph. Fix a vertex r ∈ V.
Recall that the arcs of G^bidir are the pairs (e,i) ∈ E × {1,2}. Identify each
spanning tree of G with its edge set, and each spanning arborescence of
G^bidir with its arc set.

If B is a spanning arborescence of G^bidir rooted to r, then we set

B̄ := {e | (e,i) ∈ B}.

(Recall that we are identifying spanning arborescences with their arc sets, so
that “(e,i) ∈ B” means “(e,i) is an arc of B”.)
Then:

(a) If B is a spanning arborescence of G^bidir rooted to r, then B̄ is a spanning
tree of G.

(b) The map

{spanning arborescences of G^bidir rooted to r} → {spanning trees of G},
B ↦ B̄

is a bijection.
Example 5.13.2. Here is a multigraph G (on the left) with the corresponding
multidigraph G^bidir (on the right):


Here is a spanning arborescence B of G^bidir rooted to 1, and the corresponding
spanning tree B̄ of G:


(here, the arcs of G^bidir that don’t belong to B, as well as the edges of G that
don’t belong to B̄, have been drawn as dotted arrows). It is fairly easy to see
how B can be reconstructed from B̄: You just need to replace each edge of B̄
by the appropriately directed arc (namely, the one that is “directed towards
1”).
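The reconstruction described in this example (orienting each tree edge towards the root) can be sketched in code. This is an illustration under stated assumptions, not the notes' own construction: the function names are made up, a spanning tree is given as a dictionary from edge names to endpoint pairs (u, v), and the arc (e,1) is assumed to go from u to v while (e,2) goes from v to u.

```python
from collections import deque


def orient_towards_root(tree_edges, r):
    """Return the arcs (e, i) obtained by directing every tree edge towards r."""
    neighbours = {}
    for e, (u, v) in tree_edges.items():
        neighbours.setdefault(u, []).append((e, v))
        neighbours.setdefault(v, []).append((e, u))
    arcs = set()
    seen = {r}
    queue = deque([r])
    while queue:
        x = queue.popleft()
        for e, y in neighbours.get(x, []):
            if y not in seen:
                seen.add(y)
                # y is farther from r than x, so the arc must lead from y to x.
                arcs.add((e, 1) if tree_edges[e] == (y, x) else (e, 2))
                queue.append(y)
    return arcs


def project(arcs):
    """Forget directions: send each arc (e, i) to the edge e."""
    return {e for e, _ in arcs}


tree = {"a": (1, 2), "b": (3, 2), "c": (2, 4)}  # hypothetical spanning tree
B = orient_towards_root(tree, 1)
print(sorted(B), project(B) == set(tree))
```

Projecting the resulting arc set back down recovers the original edge set, which is the surjectivity half of Proposition 5.13.1 (b) in miniature.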


Proof of Proposition 5.13.1. This is an exercise in yak-shaving (and we have, in
fact, shaved a very similar yak in Section 5.7; the only difference is that we are
no longer dealing with trees in isolation, but rather with spanning trees of G).


An introduction to graph theory, version August 2, 2023 page 215


(a) Let B be a spanning arborescence of G^bidir rooted to r. Then, B^und is a tree
(by the implication A'1 ⟹ A'3 in Theorem 5.10.5). However, it is easy to see
that B^und ≅ B̄ as multigraphs (indeed, each vertex v of B^und corresponds to the
same vertex v of B̄, whereas any edge (e,i) of B^und corresponds to the edge e
of B̄)⁵³. Thus, B̄ is a tree (since B^und is a tree)⁵⁴, therefore a spanning tree of
G (since B̄ is clearly a spanning subgraph of G). This proves Proposition 5.13.1
(a).


(b) We must prove that this map is surjective and injective.

Surjectivity: Let T be a spanning tree of G. Then, the multidigraph T^{r→}
(defined in Definition 5.7.4) is an arborescence rooted from r (by Lemma 5.7.7).
Reversing each arc in this arborescence T^{r→}, we obtain a new multidigraph
T^{r←}, which is thus an arborescence rooted to r. Unfortunately, T^{r←} is not a
subdigraph of G^bidir, for a rather stupid reason: The arcs of T^{r←} are elements
of E, whereas the arcs of G^bidir are pairs of the form (e,i) with e ∈ E and
i ∈ {1,2}.

Fortunately, this is easily fixed: For each arc e of T^{r←}, we let e' be the arc
(e,i) of G^bidir that has the same source as e (and thus the same target as e). This
is uniquely determined, since the arcs (e,1) and (e,2) of G^bidir have different
sources⁵⁵. If we replace each arc e of T^{r←} by the corresponding arc e' of G^bidir,
then we obtain a spanning subdigraph S of G^bidir that is an arborescence rooted
to r (since T^{r←} is an arborescence rooted to r, and we have only replaced its
arcs by equivalent ones with the same sources and the same targets). In other
words, we obtain a spanning arborescence S of G^bidir rooted to r. It is easy to
see that S̄ = T. Hence, the map


{spanning arborescences of G^bidir rooted to r} → {spanning trees of G},
B ↦ B̄


53 Here we need to use the fact that for each edge e of B̄, exactly one of the two pairs (e,1) and
(e,2) is an edge of B^und. But this is easy to check: At least one of the two pairs (e,1) and
(e,2) must be an arc of B (since e is an edge of B̄). In other words, at least one of the two
pairs (e,1) and (e,2) must be an edge of B^und. But both of these pairs cannot be edges of
B^und at the same time (since this would create a cycle, but B^und is a tree and thus has no
cycles). Hence, exactly one of these pairs is an edge of B^und, qed.

54 Alternatively, you can prove this as follows: The vertex r is a to-root of B (since B is an
arborescence rooted to r). Thus, for each v ∈ V, there is a path from v to r in B. By “projecting”
this path onto B̄ (that is, replacing each arc (e,i) of this path by the corresponding edge
e of B̄), we obtain a path from v to r in B̄. This shows that the multigraph B̄ is connected.
Furthermore, the definition of B̄ shows that |B̄| ≤ |B| = |V| − 1 (by Statement A'2 in Theorem
5.10.5, since B is an arborescence rooted to r). Hence, |B̄| < |V|. Thus, we can apply
the implication T5 ⟹ T1 of the Tree Equivalence Theorem (Theorem 5.2.4) to conclude that
B̄ is a tree.

55 Proof. The edge e of T is not a loop (because T is a tree, but a tree cannot have any loops).
Hence, its two endpoints are distinct. Thus, the arcs (e,1) and (e,2) of G^bidir have different
sources (since their sources are the two endpoints of e).
sends S to T. This shows that T is a value of this map. Since we have proved this
for every spanning tree T of G, we have thus shown that this map is surjective.

Injectivity: The main idea is that, in order to recover a spanning arborescence
B back from the corresponding spanning tree B̄, we just need to “orient the
edges of the tree towards r”. Here are the (annoyingly long) details:

Let B and C be two sparbs⁵⁶ such that B̄ = C̄. We must show that B = C.

Assume the contrary. Thus, B ≠ C. Let T be the tree B̄ = C̄. Thus, each edge
e of T corresponds to either an arc (e,1) or an arc (e,2) in B (since T = B̄), and
likewise for C. Conversely, each arc (e,i) of B or of C corresponds to an edge e
of T. Hence, from B ≠ C, we see that there must exist an edge e of T such that


• either we have (e,1) ∈ B and (e,2) ∈ C,


• or we have (e,1) ∈ C and (e,2) ∈ B.


Consider this edge e. We WLOG assume that (e,1) ∈ B and (e,2) ∈ C (else,
we can just swap B with C). Let the arc (e,1) of G^bidir have source s and target
t, so that (e,2) has source t and target s. The edge e thus has endpoints s and t.

Since B is an arborescence rooted to r, the vertex r is a to-root of B. Hence,
there exists a path p from s to r in B. This path p must begin with the arc (e,1)⁵⁷.
Projecting this path p down onto T, we obtain a path p̄ from s to r in T. (By
the word “projecting”, we mean replacing each arc (e,i) by the corresponding
edge e. Clearly, doing this to a path in B yields a path in T, because T = B̄.)
Since the path p begins with the arc (e,1), the “projected” path p̄ begins with
the edge e. Thus, in the tree T, the path from s to r begins with the edge e
(because this path must be the path p̄). As a consequence, t must be the second
vertex of this path (since the edge e has endpoints s and t), so that removing the
first edge from this path yields the path from t to r. Thus, d (t,r) = d (s,r) − 1,
where d denotes distance on the tree T. Hence, d (t,r) < d (s,r).

A similar argument (but with the roles of B and C swapped, as well as the
roles of s and t swapped, and the roles of (e,1) and (e,2) swapped) shows that
d (s,r) < d (t,r). But this contradicts d (t,r) < d (s,r).

This contradiction shows that our assumption was false. Thus, we have 
proved that B = C. 


56 Henceforth, “sparb” is short for “spanning arborescence of G^bidir rooted to r”.
57 Proof. Since r is a to-root of B, we know that there exists a path from t to r in B. Let 𝐭 be this
path. Extending this path 𝐭 by the vertex s and the arc (e,1) (which we both insert at the start
of 𝐭), we obtain a walk 𝐭′ from s to r in B. (So, if 𝐭 = (t, ..., r), then 𝐭′ = (s, (e,1), t, ..., r).)
However, B is an arborescence rooted to r. Thus, Statement A'4 in the Dual Arborescence
Equivalence Theorem (Theorem 5.10.5) shows that for each vertex v ∈ V, the digraph B has
a unique walk from v to r. Hence, in particular, B has a unique walk from s to r. Thus,
p = 𝐭′ (since both p and 𝐭′ are walks from s to r in B). Since 𝐭′ begins with the arc (e,1), we
thus conclude that p begins with the arc (e,1).
Forget that we fixed B and C. We thus have shown that if B and C are two
sparbs such that B̄ = C̄, then B = C. In other words, our map
{spanning arborescences of G^bidir rooted to r} → {spanning trees of G},
B ↦ B̄
is injective.
We have now shown that this map is both surjective and injective. Hence, it
is a bijection. This proves Proposition 5.13.1 (b). □


5.14. The matrix-tree theorem 
5.14.1. Introduction 


So counting spanning trees in a multigraph is a particular case of counting 
spanning arborescences (rooted to a given vertex) in a multidigraph. But how 
do we do either? Let us begin with some simple examples: 


Example 5.14.1. There is only one spanning tree of the complete graph K₁:



There is only one spanning tree of the complete graph K2: 


There are 3 spanning trees of the complete graph K3: 


(They are all isomorphic, but still distinct.) 
There are 16 spanning trees of the complete graph K₄:


(There are only two non-isomorphic ones among them.) 


This example suggests that the # of spanning trees of a complete graph Kₙ is
nⁿ⁻².
This is indeed true, and we will prove this later. For now, however, let us 
address the more general problem of counting spanning arborescences of an 
arbitrary digraph D. 
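The small counts in Example 5.14.1 can be reproduced by brute force. The sketch below is an illustration only (the function names are made up here): it enumerates all (n − 1)-element subsets of the edges of Kₙ and keeps those without a cycle, using a union-find structure for the cycle test. An acyclic graph with n vertices and n − 1 edges is a spanning tree.

```python
from itertools import combinations


def find(parent, x):
    """Union-find lookup with path halving."""
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x


def count_spanning_trees_of_complete_graph(n):
    edges = list(combinations(range(1, n + 1), 2))
    count = 0
    for subset in combinations(edges, n - 1):
        parent = list(range(n + 1))
        acyclic = True
        for u, v in subset:
            ru, rv = find(parent, u), find(parent, v)
            if ru == rv:  # u and v are already connected: this edge closes a cycle
                acyclic = False
                break
            parent[ru] = rv
        count += acyclic
    return count


print([count_spanning_trees_of_complete_graph(n) for n in range(1, 6)])
# [1, 1, 3, 16, 125]
```

This enumeration grows very quickly with n, so it is only meant as a check for tiny cases.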


5.14.2. Notations 


First, we introduce a notation: 
Definition 5.14.2. We will use the Iverson bracket notation: If 𝒜 is any
logical statement, then we set

\[
[\mathcal{A}] := \begin{cases} 1, & \text{if } \mathcal{A} \text{ is true;} \\ 0, & \text{if } \mathcal{A} \text{ is false.} \end{cases}
\]

For example, [K₂ is a tree] = 1 whereas [K₃ is a tree] = 0.

Definition 5.14.3. Let M be a matrix. Let i and j be two integers. Then,

• M_{i,j} will mean the entry of M in row i and column j;

• M_{~i,~j} will mean the matrix M with row i removed and column j removed.

For example,

\[
\begin{pmatrix} a & b & c \\ d & e & f \\ g & h & i \end{pmatrix}_{2,3} = f
\qquad \text{and} \qquad
\begin{pmatrix} a & b & c \\ d & e & f \\ g & h & i \end{pmatrix}_{\sim 2, \sim 3} = \begin{pmatrix} a & b \\ g & h \end{pmatrix}.
\]


5.14.3. The Laplacian of a multidigraph 
We shall now assign a matrix to (more or less) any multidigraph:


Definition 5.14.4. Let D = (V, A, ψ) be a multidigraph. Assume that V =
{1,2,...,n} for some n ∈ ℕ.

For any i,j ∈ V, we let a_{i,j} be the #⁵⁸ of arcs of D that have source i and
target j.

The Laplacian of D is defined to be the n × n-matrix L ∈ ℤ^{n×n} whose
entries are given by

\[
L_{i,j} = (\deg^+ i) \cdot \underbrace{[i = j]}_{\text{This is also known as } \delta_{i,j}} - a_{i,j}
\qquad \text{for all } i, j \in V.
\]

In other words, it is the matrix

\[
L = \begin{pmatrix}
\deg^+ 1 - a_{1,1} & -a_{1,2} & \cdots & -a_{1,n} \\
-a_{2,1} & \deg^+ 2 - a_{2,2} & \cdots & -a_{2,n} \\
\vdots & \vdots & \ddots & \vdots \\
-a_{n,1} & -a_{n,2} & \cdots & \deg^+ n - a_{n,n}
\end{pmatrix}.
\]


58 Recall that the symbol “#” means “number”.
Example 5.14.5. Let D be the digraph


Then, its Laplacian is

\[
\begin{pmatrix} 2-1 & -1 & 0 \\ 0 & 1-0 & -1 \\ 0 & 0 & 1-1 \end{pmatrix}
= \begin{pmatrix} 1 & -1 & 0 \\ 0 & 1 & -1 \\ 0 & 0 & 0 \end{pmatrix}.
\]

One thing we notice from this example is that loops do not matter at all to
the Laplacian L. Indeed, a loop with source i and target i counts once in deg⁺ i
and once in a_{i,i}, but these contributions cancel out.
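As a minimal sketch of the definition of the Laplacian (the function name `laplacian`, the encoding of arcs as (source, target) pairs on the vertices 1, 2, ..., n, and the sample digraph are assumptions made for illustration), one can build L arc by arc. Each arc with source s and target t adds 1 to the diagonal entry in row s (its contribution to the outdegree of s) and subtracts 1 from the entry in row s and column t; for a loop, the two contributions hit the same entry and cancel.

```python
def laplacian(n, arcs):
    """Laplacian of the multidigraph with vertices 1..n and the given arcs."""
    L = [[0] * n for _ in range(n)]
    for s, t in arcs:
        L[s - 1][s - 1] += 1  # the arc counts once in the outdegree of s
        L[s - 1][t - 1] -= 1  # and once in a_{s,t}
    return L


# Hypothetical digraph with loops at the vertices 1 and 3.
with_loops = [(1, 1), (1, 2), (2, 3), (3, 3)]
without_loops = [(1, 2), (2, 3)]
print(laplacian(3, with_loops))
print(laplacian(3, with_loops) == laplacian(3, without_loops))
```

Both arc lists give the same matrix, in line with the observation that loops do not affect L.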

Here is a simple property of Laplacians: 


Proposition 5.14.6. Let D = (V, A, ψ) be a multidigraph. Assume that V =
{1,2,...,n} for some positive integer n.
Then, the Laplacian L of D is singular; i.e., we have det L = 0.


Proof. The sum of all columns of L is the zero vector, because for each i ∈ V we
have

\begin{align*}
\sum_{j=1}^n L_{i,j}
&= \sum_{j=1}^n \left( (\deg^+ i) \cdot [i = j] - a_{i,j} \right)
\qquad \left( \text{by the definition of } L \right) \\
&= \underbrace{\sum_{j=1}^n (\deg^+ i) \cdot [i = j]}_{\substack{= \deg^+ i \\ \text{(since only the addend} \\ \text{for } j = i \text{ can be nonzero)}}}
 - \underbrace{\sum_{j=1}^n a_{i,j}}_{\substack{= \deg^+ i \\ \text{(since this is counting} \\ \text{all arcs with source } i \text{)}}} \\
&= \deg^+ i - \deg^+ i = 0.
\end{align*}

In other words, we have Le = 0 for the vector e := (1,1,...,1)^T. Thus, this
vector e lies in the kernel (aka nullspace) of L, and so L is singular.
(Note that we used the positivity of n here! If n = 0, then e is the zero vector,
because a vector with 0 entries is automatically the zero vector.) □


5.14.4. The Matrix-Tree Theorem: statement 


Proposition 5.14.6 shows that the determinant of the Laplacian of a digraph is
not very interesting. However, when a matrix has determinant 0, its largest
nonzero minors (= determinants of submatrices) often carry some interesting
information; they are “the closest the matrix has” to a nonzero determinant.
In the case of the Laplacian, they turn out to count spanning arborescences:


Theorem 5.14.7 (Matrix-Tree Theorem). Let D = (V, A, ψ) be a multidigraph.
Assume that V = {1,2,...,n} for some positive integer n.
Let L be the Laplacian of D. Let r be a vertex of D. Then,


(# of spanning arborescences of D rooted to r) = det (L_{~r,~r}).


Before we prove this, some remarks: 


• The determinant det (L_{~r,~r}) is the (r,r)-th entry of the adjugate matrix
of L.


• The V = {1,2,...,n} assumption is a typical “WLOG assumption”: If
you have an arbitrary digraph D, you can always rename its vertices as
1,2,...,n, and then this assumption will be satisfied. Thus, Theorem
5.14.7 helps you count the spanning arborescences of any digraph. That
said, you can also drop the V = {1,2,...,n} assumption from Theorem
5.14.7 if you are okay with matrices whose rows and columns are indexed
not by numbers but by elements of an arbitrary finite set⁵⁹.
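To see the statement of the Matrix-Tree Theorem in action, here is a hedged sketch that evaluates det (L_{~r,~r}) exactly. All names (`laplacian`, `minor`, `det`, `count_by_mtt`) and the sample digraphs are assumptions made for illustration; the determinant is computed by Gaussian elimination over the rationals (Python's `fractions.Fraction`), which is one of several possible ways to avoid rounding errors.

```python
from fractions import Fraction


def laplacian(n, arcs):
    """Laplacian of the multidigraph with vertices 1..n and (source, target) arcs."""
    L = [[0] * n for _ in range(n)]
    for s, t in arcs:
        L[s - 1][s - 1] += 1
        L[s - 1][t - 1] -= 1
    return L


def minor(M, i, j):
    """The matrix M with row i and column j removed (both 1-based)."""
    return [[x for c, x in enumerate(row, 1) if c != j]
            for k, row in enumerate(M, 1) if k != i]


def det(M):
    """Exact determinant of an integer matrix via Gaussian elimination."""
    M = [[Fraction(x) for x in row] for row in M]
    size = len(M)
    result = Fraction(1)
    for c in range(size):
        pivot = next((k for k in range(c, size) if M[k][c] != 0), None)
        if pivot is None:
            return 0
        if pivot != c:
            M[c], M[pivot] = M[pivot], M[c]
            result = -result  # a row swap flips the sign
        result *= M[c][c]
        for k in range(c + 1, size):
            f = M[k][c] / M[c][c]
            M[k] = [a - f * b for a, b in zip(M[k], M[c])]
    return int(result)


def count_by_mtt(n, arcs, r):
    """Right hand side of the Matrix-Tree Theorem: det of L without row/column r."""
    return det(minor(laplacian(n, arcs), r, r))


# Hypothetical example: arcs 1->2, 2->3, 3->1, 2->1 and a loop at 1; root r = 1.
print(count_by_mtt(3, [(1, 2), (2, 3), (3, 1), (2, 1), (1, 1)], 1))  # 2
```

In this example the two spanning arborescences rooted to 1 are {2→1, 3→1} and {2→3, 3→1}, matching the determinant.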


5.14.5. Application: Counting the spanning trees of Kₙ


Now, let us use the Matrix-Tree Theorem to count the spanning trees of Kₙ. This
should provide some intuition for the theorem before we come to its proof.

We fix a positive integer n. Let L be the Laplacian of the multidigraph Kₙ^bidir
(where Kₙ, as we recall, is the complete graph on the set {1,2,...,n}). Then,
each vertex of Kₙ^bidir has outdegree n − 1, and thus we have

\[
L = \begin{pmatrix}
n-1 & -1 & \cdots & -1 \\
-1 & n-1 & \cdots & -1 \\
\vdots & \vdots & \ddots & \vdots \\
-1 & -1 & \cdots & n-1
\end{pmatrix}
\]

(this is the n × n-matrix whose diagonal entries are n − 1 and whose off-diagonal
entries are −1). By Proposition 5.13.1 (b) (applied to G = Kₙ and r = 1),
there is a bijection between {spanning arborescences of Kₙ^bidir rooted to 1} and


59 Such matrices are perfectly fine, just somewhat unusual and hard to write down (which row
do you put on top?). See https://mathoverflow.net/questions/317105 for details.
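{spanning trees of Kₙ}. Hence, by the bijection principle, we have

\begin{align*}
(\text{\# of spanning trees of } K_n)
&= (\text{\# of spanning arborescences of } K_n^{\operatorname{bidir}} \text{ rooted to } 1) \\
&= \det (L_{\sim 1, \sim 1})
\qquad \left( \text{by Theorem 5.14.7, applied to } D = K_n^{\operatorname{bidir}} \text{ and } r = 1 \right) \\
&= \det \underbrace{\begin{pmatrix}
n-1 & -1 & \cdots & -1 \\
-1 & n-1 & \cdots & -1 \\
\vdots & \vdots & \ddots & \vdots \\
-1 & -1 & \cdots & n-1
\end{pmatrix}}_{\text{an } (n-1) \times (n-1)\text{-matrix}}.
\end{align*}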


How do we compute this determinant? Here are three ways: 


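• The most elementary approach is using row transformations:

\begin{align*}
& \det \begin{pmatrix}
n-1 & -1 & \cdots & -1 \\
-1 & n-1 & \cdots & -1 \\
\vdots & \vdots & \ddots & \vdots \\
-1 & -1 & \cdots & n-1
\end{pmatrix} \\
&= \det \begin{pmatrix}
n-1 & -1 & -1 & \cdots & -1 \\
-n & n & 0 & \cdots & 0 \\
-n & 0 & n & \cdots & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
-n & 0 & 0 & \cdots & n
\end{pmatrix}
\qquad \left( \begin{array}{c} \text{here, we have subtracted the 1st row} \\ \text{from each other row} \end{array} \right) \\
&= n^{n-2} \det \begin{pmatrix}
n-1 & -1 & -1 & \cdots & -1 \\
-1 & 1 & 0 & \cdots & 0 \\
-1 & 0 & 1 & \cdots & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
-1 & 0 & 0 & \cdots & 1
\end{pmatrix}
\qquad \left( \begin{array}{c} \text{here, we have factored out an } n \text{ from each} \\ \text{row except for the first row} \end{array} \right) \\
&= n^{n-2} \underbrace{\det \begin{pmatrix}
1 & 0 & 0 & \cdots & 0 \\
-1 & 1 & 0 & \cdots & 0 \\
-1 & 0 & 1 & \cdots & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
-1 & 0 & 0 & \cdots & 1
\end{pmatrix}}_{\substack{= 1 \\ \text{(since the matrix is triangular} \\ \text{with diagonal entries } 1, 1, \ldots, 1 \text{)}}}
\qquad \left( \begin{array}{c} \text{here, we have added the 2nd,} \\ \text{3rd, etc. rows to the 1st row} \end{array} \right) \\
&= n^{n-2}.
\end{align*}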
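• The so-called matrix determinant lemma says that for any m × m-matrix
A ∈ ℝ^{m×m}, any column vector u ∈ ℝ^{m×1} and any row vector v ∈ ℝ^{1×m},
we have

\[
\det (A + uv) = \det A + v (\operatorname{adj} A) u.
\]

This helps us compute our determinant, since

\[
\begin{pmatrix}
n-1 & -1 & \cdots & -1 \\
-1 & n-1 & \cdots & -1 \\
\vdots & \vdots & \ddots & \vdots \\
-1 & -1 & \cdots & n-1
\end{pmatrix}
= \underbrace{\begin{pmatrix}
n & 0 & \cdots & 0 \\
0 & n & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & n
\end{pmatrix}}_{= A}
+ \underbrace{\begin{pmatrix} -1 \\ -1 \\ \vdots \\ -1 \end{pmatrix}}_{= u}
\underbrace{\begin{pmatrix} 1 & 1 & \cdots & 1 \end{pmatrix}}_{= v}.
\]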


• Here is an approach that is heavier on linear algebra (specifically, eigenvectors
and eigenvalues⁶⁰):

Let (e₁, e₂, ..., e_{n−1}) be the standard basis of the ℝ-vector space ℝ^{n−1} (so
that e_i is the column vector with its i-th coordinate equal to 1 and all its
other coordinates equal to 0). Then, we can find the following n − 1 eigenvectors
of our (n − 1) × (n − 1)-matrix

\[
\begin{pmatrix}
n-1 & -1 & \cdots & -1 \\
-1 & n-1 & \cdots & -1 \\
\vdots & \vdots & \ddots & \vdots \\
-1 & -1 & \cdots & n-1
\end{pmatrix}:
\]

– the n − 2 eigenvectors e₁ − e_i for all i ∈ {2,3,...,n − 1}, each of them
with eigenvalue n (check this!);

– the eigenvector e₁ + e₂ + ··· + e_{n−1} with eigenvalue 1 (check this!).

Since these n − 1 eigenvectors are linearly independent (check this!), they
form a basis of ℝ^{n−1}. Hence, our matrix is similar to the diagonal matrix
with diagonal entries n, n, ..., n, 1 (with n appearing n − 2 times) (by
[Treil17, Chapter 4, Theorem 2.1]), and therefore has determinant

\[
\underbrace{n \cdot n \cdots n}_{n-2 \text{ times}} \cdot 1 = n^{n-2}.
\]
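For the second approach in the list above (the matrix determinant lemma), the remaining arithmetic can be sketched as follows. This is a worked completion under the assumption n ≥ 2; the shorthand m = n − 1 and the notation I_m for the m × m identity matrix are introduced here only for this computation, with A = n I_m, u the column vector of −1's and v the row vector of 1's as in the decomposition.

```latex
\begin{align*}
\det A &= \det (n I_m) = n^{m} = n^{n-1}, \\
\operatorname{adj} A &= n^{m-1} I_m = n^{n-2} I_m, \\
v (\operatorname{adj} A) u &= n^{n-2} \cdot v u = n^{n-2} \cdot \bigl( -(n-1) \bigr), \\
\det (A + uv) &= n^{n-1} - (n-1) \, n^{n-2}
  = n^{n-2} \bigl( n - (n-1) \bigr) = n^{n-2}.
\end{align*}
```

This agrees with the value obtained by the other two approaches.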


There are other ways as well. Either way, the result we obtain is nⁿ⁻². Thus,
we have proved (relying on the Matrix-Tree Theorem, which we haven’t yet
proved):


60 See [Treil17, Chapter 4] for a refresher.

Theorem 5.14.8 (Cayley’s formula). Let n be a positive integer. Then, the #
of spanning trees of the complete graph Kₙ is nⁿ⁻².


In other words: 


Corollary 5.14.9. Let n be a positive integer. Then, the # of simple graphs
with vertex set {1,2,...,n} that are trees is nⁿ⁻².


Proof. This is just Theorem 5.14.8, since the simple graphs with vertex set
{1,2,...,n} that are trees are precisely the spanning trees of Kₙ. □


There are many ways to prove Cayley’s formula (Theorem |5.14.8). I can par- 
ticularly recommend the two combinatorial proofs given in §2.4 and 
§2.5], as well as Joyal’s proof sketched in [Leinst19]. Most textbooks on enu- 
merative combinatorics give one proof or another; e.g., Appendix to 
Chapter 9] gives three. Cayley’s formula also appears in Aigner’s and Ziegler’s 
best-of compilation of mathematical proofs Chapter 33] with four 
different proofs. Note that some of the sources use a matrix-tree theorem for
undirected graphs; this is a particular case of our matrix-tree theorem.⁶¹

However, in order to complete our proof, we still need to prove the Matrix- 
Tree Theorem. 


5.14.6. Preparations for the proof 


In order to prepare for the proof of the Matrix-Tree Theorem, we state a simple 
lemma (yet another criterion for a digraph to be an arborescence): 


Lemma 5.14.10. Let D = (V, A, ψ) be a multidigraph. Let r be a vertex of
D. Assume that D has no cycles. Assume moreover that D has no arcs with
source r. Assume furthermore that each vertex v ∈ V ∖ {r} has outdegree 1.
Then, the digraph D is an arborescence rooted to r.


This lemma is precisely Exercise (b), at least after reversing all arcs. But 
let us give a self-contained proof here: 


61 One more remark: In Corollary 5.14.9, we have counted the trees with n vertices (i.e., simple
graphs with vertex set {1,2,...,n} that are trees). It sounds equally natural to count the
“unlabelled trees with n vertices”, i.e., the equivalence classes of such trees up to isomorphism.
Unfortunately, this is one of those “messy numbers” with no good expression: the
best formula known is recursive. There is also an asymptotic formula (“Otter’s formula”,
[Otter48]): the number of equivalence classes of n-vertex trees (up to isomorphism) is

\[
\approx \beta \frac{\alpha^n}{n^{5/2}}
\]

with α ≈ 2.955 and β ≈ 0.5349.
Proof of Lemma 5.14.10. Let u be any vertex of D. Let p = (v₀, a₁, v₁, a₂, v₂, ..., aₖ, vₖ)
be a longest path of D that starts at u⁶². Thus, v₀ = u.

We shall show that vₖ = r. Indeed, assume the contrary. Thus, vₖ ≠ r, so that
vₖ ∈ V ∖ {r}. Hence, the vertex vₖ has outdegree 1 (since we assumed that each
vertex v ∈ V ∖ {r} has outdegree 1). Thus, there exists an arc b of D that has
source vₖ. Consider this arc b, and let w be its target. Thus, appending the arc
b and the vertex w to the end of the path p, we obtain a walk


𝐰 = (v₀, a₁, v₁, a₂, v₂, ..., aₖ, vₖ, b, w)


of D that starts at u (since v₀ = u). Proposition 4.5.9 shows that this walk 𝐰
either is a path or contains a cycle. Hence, 𝐰 is a path (since D has no cycles).
Thus, 𝐰 is a path of D that starts at u. Since 𝐰 is longer than p (namely, longer
by 1), this shows that p is not the longest path of D that starts at u. But this
contradicts the very definition of p.

This contradiction shows that our assumption was false. Hence, vₖ = r. Thus,
p is a path from u to r (since v₀ = u and vₖ = r). Therefore, the digraph D has
a path from u to r (namely, p).

Forget that we fixed u. We thus have shown that for each vertex u of D,
the digraph D has a path from u to r. In other words, r is a to-root of D.
Furthermore, we have deg⁺ r = 0 (since D has no arcs with source r), and each
v ∈ V ∖ {r} satisfies deg⁺ v = 1 (since we have assumed that each vertex v ∈
V ∖ {r} has outdegree 1). In other words, the digraph D satisfies Statement A'6
from the dual arborescence equivalence theorem (Theorem 5.10.5). Therefore,
it satisfies Statement A'1 from that theorem as well (since all six statements A'1,
A'2, ..., A'6 are equivalent). In other words, D is an arborescence rooted to r.
This proves Lemma 5.14.10. □


5.14.7. The Matrix-Tree Theorem: proof 


We shall now prove the Matrix-Tree Theorem (Theorem [5.14.7), guided by the 
following battle plan: 


1. First, we will prove it in the case when each vertex v ∈ V ∖ {r} has
outdegree 1. In this case, after removing all arcs with source r from D (these
arcs do not matter, since neither the submatrix L_{~r,~r} nor the spanning
arborescences rooted to r depend on them), we have essentially two options
(subcases): either D is itself an arborescence or D has a cycle.


2. Then, we will prove the matrix-tree theorem in the slightly more general
case when each v ∈ V ∖ {r} has outdegree ≤ 1. This is easy, since a vertex
v ∈ V ∖ {r} having outdegree 0 trivializes the theorem.


62 Such a path clearly exists, since the length-0 path (u) is a path of D that starts at u, and since
a path of D cannot have length larger than |V| − 1.
3. Finally, we will prove the theorem in the general case. This is done by
strong induction on the number of arcs of D. Every time you have a
vertex v ∈ V ∖ {r} with outdegree > 1, you can pick such a vertex and
color the outgoing arcs from it red and blue in such a way that each color
is used at least once. Then, you can consider the subdigraph of D obtained
by removing all blue arcs (call it D^red) and the subdigraph of D obtained
by removing all red arcs (call it D^blue). You can then apply the induction
hypothesis to D^red and to D^blue (since each of these two subdigraphs has
fewer arcs than D), and add the results together. The good news is that
both the # of spanning arborescences rooted to r and the determinant
det (L_{~r,~r}) “behave additively” (we will soon see what this means).
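The additivity mentioned in Step 3 can be observed numerically on a tiny example before it is proved. The sketch below is illustrative only: the function names, the sample digraph and the choice of red and blue arcs are assumptions made here, the determinant is evaluated by the Leibniz formula, and arborescences are counted by trying every choice of one outgoing arc per non-root vertex.

```python
from itertools import permutations, product


def minor_det(n, arcs, r):
    """det of the Laplacian with row r and column r removed (vertices 1..n)."""
    keep = [v for v in range(1, n + 1) if v != r]
    # Entry (i, j) is (outdegree of i) * [i = j] - (# of arcs from i to j).
    M = [[sum(s == i for s, _ in arcs) * (i == j) - arcs.count((i, j))
          for j in keep] for i in keep]
    m = len(M)
    total = 0
    for sigma in permutations(range(m)):
        inversions = sum(sigma[a] > sigma[b]
                         for a in range(m) for b in range(a + 1, m))
        term = (-1) ** inversions
        for i in range(m):
            term *= M[i][sigma[i]]
        total += term
    return total


def count_sparbs(n, arcs, r):
    """Brute-force count of the spanning arborescences rooted to r."""
    others = [v for v in range(1, n + 1) if v != r]
    out_targets = [[t for s, t in arcs if s == v] for v in others]
    count = 0
    for choice in product(*out_targets):
        nxt = dict(zip(others, choice))
        ok = True
        for v in others:
            for _ in range(n):
                if v == r:
                    break
                v = nxt[v]
            ok = ok and v == r
        count += ok
    return count


# Hypothetical digraph on {1, 2, 3} with r = 1; the arcs leaving vertex 2 are colored.
base = [(1, 2), (1, 3), (1, 2), (3, 1), (3, 2)]
red, blue = [(2, 3)], [(2, 1), (2, 1)]
for f in (minor_det, count_sparbs):
    print(f(3, base + red + blue, 1), f(3, base + red, 1), f(3, base + blue, 1))
```

In both printed rows, the first number equals the sum of the other two (5 = 1 + 4 here).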


So let us begin with Step 1. We first study a very special case: 


Lemma 5.14.11. Let D = (V, A, ψ) be a multidigraph. Let r be a vertex of
D. Assume that D has no cycles. Assume moreover that D has no arcs with
source r. Assume furthermore that each vertex v ∈ V ∖ {r} has outdegree 1.
Then:


(a) The digraph D has a unique spanning arborescence rooted to r.


(b) Assume that V = {1,2,...,n} for some n ∈ ℕ. Let L be the Laplacian
of D. Then, det (L_{~r,~r}) = 1.


Proof. (a) Lemma 5.14.10 shows that the digraph D itself is an arborescence
rooted to r.

As a consequence, D itself is a spanning arborescence of D rooted to r.

Therefore, |A| = |V| − 1 (by Statement A'2 in the Dual Arborescence Equivalence
Theorem (Theorem 5.10.5)⁶³). Hence, D has no spanning arborescences
other than itself (because the condition |A| = |V| − 1 would get destroyed as
soon as we remove an arc). So the only spanning arborescence of D rooted to r
is D itself. This proves Lemma 5.14.11 (a).


(b) We WLOG assume that r = n (otherwise, we can swap r with n, so that
L_{~r,~r} becomes L_{~n,~n}).

Let D' be the digraph D with a loop added at each vertex, i.e., the multidigraph
obtained from D by adding n extra arcs ℓ₁, ℓ₂, ..., ℓₙ and letting each arc
ℓᵢ have source i and target i.

Let S_{n−1} denote the group of permutations of the set

\[
\{1, 2, \ldots, n-1\} = \underbrace{\{1, 2, \ldots, n\}}_{= V} \setminus \{ \underbrace{n}_{= r} \} = V \setminus \{r\}.
\]


63 or by the fact that |A| is the sum of the outdegrees of all vertices of D
Now, from r = n, we have

\[
\det (L_{\sim r, \sim r}) = \det (L_{\sim n, \sim n})
= \sum_{\sigma \in S_{n-1}} \operatorname{sign} \sigma \cdot \prod_{i=1}^{n-1} L_{i, \sigma(i)}
\tag{22}
\]

(by the Leibniz formula for the determinant). We shall now study the addends
in the sum on the right hand side of this equality. Specifically, we will show that
the only addend whose product \prod_{i=1}^{n-1} L_{i,σ(i)} is nonzero is the addend for σ = id.

Indeed, let σ ∈ S_{n−1} be a permutation such that the product \prod_{i=1}^{n-1} L_{i,σ(i)} is
nonzero. We shall prove that σ = id.

Consider an arbitrary v ∈ {1,2,...,n − 1}. Then, L_{v,σ(v)} ≠ 0 (because L_{v,σ(v)}
is a factor in the product \prod_{i=1}^{n-1} L_{i,σ(i)}, which is nonzero). However, the definition
of L yields L_{v,σ(v)} = (deg⁺ v) · [v = σ(v)] − a_{v,σ(v)}. Thus,

\[
(\deg^+ v) \cdot [v = \sigma(v)] - a_{v, \sigma(v)} = L_{v, \sigma(v)} \neq 0.
\]

Hence, at least one of the numbers [v = σ(v)] and a_{v,σ(v)} is nonzero. In other
words, we have v = σ(v) (this is what it means for [v = σ(v)] to be nonzero) or
the digraph D has an arc with source v and target σ(v) (because this is what it
means for a_{v,σ(v)} to be nonzero). In either case, the digraph D' has an arc with
source v and target σ(v) (because if v = σ(v), then one of the loops we added
to D does the trick). We can apply the same argument to σ(v) instead of v, and
obtain an arc with source σ(v) and target σ(σ(v)). Similarly, we obtain an arc
with source σ(σ(v)) and target σ(σ(σ(v))). We can continue this reasoning
indefinitely. By continuing it for n steps, we obtain a walk


(v, ∗, σ(v), ∗, σ²(v), ∗, σ³(v), ..., ∗, σⁿ(v))


in the digraph D', where each asterisk means an arc (we don’t care about what
these arcs are, so we are not giving them names). This walk cannot be a path
(since it has n + 1 vertices, but D' has only n vertices); thus, it must contain
a cycle (by Proposition 4.5.9). All arcs of this cycle must be loops (because
otherwise, we could remove the loops from this cycle and obtain a cycle of D,
but we know that D has no cycles). In particular, its first arc is a loop. Thus, our
above walk (v, ∗, σ(v), ∗, σ²(v), ∗, σ³(v), ..., ∗, σⁿ(v)) contains a loop (since
the arcs of the cycle come from this walk). In other words, we have σ^i(v) =
σ^{i+1}(v) for some i ∈ {0,1,...,n − 1}. Since σ is injective, we can apply σ^{−i}
to both sides of this equality, and conclude that v = σ(v). In other words,
σ(v) = v.

Forget that we fixed v. We thus have shown that σ(v) = v for each v ∈
{1,2,...,n − 1}. In other words, σ = id.




Forget that we fixed σ. We thus have proved that σ = id for each permutation
σ ∈ S_{n−1} for which the product \prod_{i=1}^{n-1} L_{i,σ(i)} is nonzero. In other words, the
only permutation σ ∈ S_{n−1} for which the product \prod_{i=1}^{n-1} L_{i,σ(i)} is nonzero is the
permutation id.
Thus, the only nonzero addend on the right hand side of (22) is the addend
corresponding to σ = id. Hence, (22) can be simplified as follows:

\[
\det (L_{\sim n, \sim n}) = \underbrace{\operatorname{sign}(\operatorname{id})}_{= 1} \cdot \prod_{i=1}^{n-1} L_{i, \operatorname{id}(i)} = \prod_{i=1}^{n-1} L_{i, \operatorname{id}(i)}.
\]

Since each i ∈ {1,2,...,n − 1} satisfies

\[
L_{i, \operatorname{id}(i)} = L_{i,i}
= \underbrace{(\deg^+ i)}_{= 1} \cdot \underbrace{[i = i]}_{= 1} - \underbrace{a_{i,i}}_{= 0}
\qquad \left( \text{by the definition of } L \right)
\quad = 1 \cdot 1 - 0 = 1
\]

(here, deg⁺ i = 1 since i has outdegree 1 (because each vertex v ∈ V ∖ {r} has
outdegree 1, and we can apply this to v = i since i ∈ {1,2,...,n − 1} = V ∖ {r}),
and a_{i,i} = 0 since D has no cycles and thus cannot have a loop with source i),
this can be simplified to det (L_{~n,~n}) = \prod_{i=1}^{n-1} 1 = 1. This proves Lemma 5.14.11
(b). □
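The key step of this proof (only the identity permutation contributes to the Leibniz formula) can be watched on a small instance. The following sketch is an illustration with made-up names and a made-up digraph: n = 4, r = 4, and the arcs 1→2, 2→4, 3→2, so that D has no cycles, no arcs with source 4, and outdegree 1 at the vertices 1, 2, 3. The matrix M below is the Laplacian of this digraph with its 4th row and 4th column removed.

```python
from itertools import permutations


def nonzero_leibniz_terms(M):
    """List the permutations sigma for which the product of M[i][sigma(i)] is nonzero."""
    m = len(M)
    result = []
    for sigma in permutations(range(m)):
        term = 1
        for i in range(m):
            term *= M[i][sigma[i]]
        if term != 0:
            result.append(sigma)
    return result


# Rows and columns correspond to the vertices 1, 2, 3 (0-based indices 0, 1, 2).
M = [[1, -1, 0],   # vertex 1: outdegree 1, arc 1 -> 2
     [0, 1, 0],    # vertex 2: outdegree 1, arc 2 -> 4 (column 4 is removed)
     [0, -1, 1]]   # vertex 3: outdegree 1, arc 3 -> 2
print(nonzero_leibniz_terms(M))  # [(0, 1, 2)]
```

Only the identity permutation survives, and its product is 1 · 1 · 1 = 1, as the lemma claims.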


Next, we drop the “no cycles” condition: 


Lemma 5.14.12. Let D = (V, A, ψ) be a multidigraph. Let r be a vertex of
D. Assume that each vertex v ∈ V ∖ {r} has outdegree 1. Then, the MTT
holds for these D and r. (Here and in the following, “MTT” is short for
“Matrix-Tree Theorem”, i.e., for Theorem 5.14.7.)


Proof. First of all, we note that an arc with source r cannot appear in any
spanning arborescence of D rooted to r (since any such arborescence satisfies
deg⁺ r = 0, according to Statement A'6 in the Dual Arborescence Equivalence
Theorem (Theorem 5.10.5)). Furthermore, the arcs with source r do not affect
the matrix L_{~r,~r}, since they only appear in the r-th row of the matrix L (but
this r-th row is removed in L_{~r,~r}).

Hence, any arc with source r can be removed from D without disturbing
anything we currently care about. Thus, we WLOG assume that D has no arcs
with source r (else, we can just remove them from D).

We WLOG assume that r = n (otherwise, we can swap r with n, so that L_{~r,~r}
becomes L_{~n,~n}).
We are in one of the following two cases:

Case 1: The digraph D has a cycle.

Case 2: The digraph D has no cycles.

Consider Case 1. In this case, D has a cycle 𝐯 = (v₁, ∗, v₂, ∗, ..., ∗, v_m) (where
we again are putting asterisks in place of the arcs). This cycle cannot contain
r (since D has no arcs with source r). Thus, all its vertices v₁, v₂, ..., v_m belong
to V ∖ {r}. Hence, for each i ∈ {1,2,...,m − 1}, the vertex v_i has outdegree 1
(since we assumed that each vertex v ∈ V ∖ {r} has outdegree 1). Consequently,
for each i ∈ {1,2,...,m − 1}, the only arc of D that has source v_i is the arc that
follows v_i on the cycle 𝐯. Therefore, in the matrix L, the v_i-th row has a 1 in
the v_i-th position (because deg⁺ (v_i) = 1), a −1 in the v_{i+1}-th position (since
the arc that follows v_i on the cycle 𝐯 has source v_i and target v_{i+1}), and 0s in all
other positions. Since r = n, the same must then be true for the matrix L_{~r,~r}:
That is, the v_i-th row of the matrix L_{~r,~r} has a 1 in the v_i-th position, a −1 in
the v_{i+1}-th position, and 0s in all other positions. Thus, the sum of the v₁-th,
v₂-th, ..., v_{m−1}-th rows of L_{~r,~r} is the zero vector (since the 1s and the −1s
just cancel out⁶⁴)⁶⁵.

So we have found a nonempty set of rows of L_{~r,~r} whose sum is the zero
vector. This yields that the matrix L_{~r,~r} is singular (by basic properties of
determinants⁶⁶), so its determinant is det (L_{~r,~r}) = 0. On the other hand, the
digraph D has no spanning arborescence (because, in order to get a spanning
arborescence of D, we would have to remove at least one arc of our cycle 𝐯

64Namely, the —1 in the v;1-th position of the v;-th row gets cancelled by the 1 in the v;,,-th 
position of the vj;1-th row. (We are using the fact that Vm = v1 here.) 

Tet me illustrate this on a representative example: Assume that the numbers 
01, 02,+++,Um—1,Um are 1,2,...,m — 1,1 (respectively). Then, the first m — 1 rows of L look 
as follows: 


1 -1 
—1 1 
(where all the missing entries are zeroes). Thus, the sum of these m — 1 rows is the zero 
vector. The same is therefore true of the matrix L~r ~r (since the first m — 1 rows of the latter 
matrix are just the first m — 1 rows of L, with their r-th entries removed). 
The general case is essentially the same as this example; the only difference is that the 
relevant rows are in other positions. 
Specifically, we are using the following fact: “Let M be a square matrix. If there is a certain 
nonempty set of rows of M whose sum is the zero vector, then the matrix M is singular.”. 
To prove this fact, we let S be this nonempty set. Choose one row from this set, and 
call it the chosen row. Now, add all the other rows from this set to this one chosen row. 
This operation does not change the determinant of M (since the determinant of a matrix 
is unchanged when we add one row to another), but the resulting matrix has a zero row 
(namely, the chosen row) and thus has determinant 0. Hence, the original matrix M must
have had determinant 0 as well. In other words, M was singular, qed.

(since an arborescence cannot have a cycle); but then, the source of this arc
would have outdegree 0, and thus we could no longer find a path from this 
source to r, so we would not obtain a spanning arborescence). In other words, 


(# of spanning arborescences of D rooted to r) = 0. 


Comparing this with $\det (L_{\sim r,\sim r}) = 0$, we conclude that the MTT holds in this
case (since it claims that 0 = 0). Thus, Case 1 is done.
Next, we consider Case 2. In this case, D has no cycles. Then, $\det (L_{\sim r,\sim r}) = 1$
(by Lemma 5.14.11 (b)) and

(# of spanning arborescences of D rooted to r) = 1 (by Lemma 5.14.11 (a)).

Thus, the MTT boils down to 1 = 1, which is again true.
So Lemma 5.14.12 is proved. □


Next, we venture into a mildly greater generality: 


Lemma 5.14.13. Let D = (V, A, ψ) be a multidigraph. Let r be a vertex of D.
Assume that each vertex v ∈ V \ {r} has outdegree ≤ 1. Then, the MTT (=
Matrix-Tree Theorem) holds for these D and r.


Proof. If each vertex v ∈ V \ {r} has outdegree 1, then this is true by Lemma
5.14.12.

Thus, we WLOG assume that this is not the case. Hence, some vertex v ∈
V \ {r} has outdegree ≠ 1. Consider this v. The outdegree of v is ≠ 1, but also
≤ 1 (by the hypothesis of the lemma). Hence, this outdegree must be 0. That
is, there is no arc with source v.

WLOG assume that r = n (otherwise, swap r with n). 

We have v ≠ r. Hence, the digraph D has no path from v to r (since any such
path would include an arc with source v, but there is no arc with source v).

Therefore, D has no spanning arborescence rooted to r (because any such
spanning arborescence would have to have a path from v to r). In other words,


(# of spanning arborescences of D rooted to r) = 0.


Also, $\det (L_{\sim r,\sim r}) = 0$ (since the v-th row of the matrix $L_{\sim r,\sim r}$ is 0 (because
there is no arc with source v)). So the MTT boils down to 0 = 0 again, and thus
Lemma 5.14.13 is proved. □
We are now ready to prove the MTT in the general case: 


Proof of Theorem 5.14.7. First, we introduce a notation:


Let M and N be two n × n-matrices that agree in all but one row.
That is, there exists some $j \in \{1, 2, \ldots, n\}$ such that for each i ≠ j,
we have

(the i-th row of M) = (the i-th row of N).

Then, we write $M \overset{j}{\equiv} N$, and we let $M \overset{j}{+} N$ be the n × n-matrix that
is obtained from M by adding the j-th row of N to the j-th row of M
(while leaving all remaining rows unchanged).


For example, if
\[
M = \begin{pmatrix} a & b & c \\ d & e & f \\ g & h & i \end{pmatrix}
\quad \text{and} \quad
N = \begin{pmatrix} a & b & c \\ d' & e' & f' \\ g & h & i \end{pmatrix},
\]
then $M \overset{2}{\equiv} N$ and
\[
M \overset{2}{+} N = \begin{pmatrix} a & b & c \\ d + d' & e + e' & f + f' \\ g & h & i \end{pmatrix}.
\]


A well-known property of determinants (the multilinearity of the determinant)
says that if M and N are two n × n-matrices and $j \in \{1, 2, \ldots, n\}$ is a
number such that $M \overset{j}{\equiv} N$, then
\[
\det \left( M \overset{j}{+} N \right) = \det M + \det N.
\]
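The multilinearity property just stated can be checked numerically on small integer matrices. The following Python sketch is only an illustration: the helper names `det` and `add_in_row` are ours (they do not come from the text), rows are indexed from 0 rather than from 1, and the determinant is computed by naive Laplace expansion, which is sensible only for tiny matrices.

```python
def det(matrix):
    """Determinant by Laplace expansion along the first row (small matrices only)."""
    if not matrix:
        return 1  # determinant of the 0 x 0 matrix
    return sum(
        (-1) ** j * matrix[0][j] * det([row[:j] + row[j + 1:] for row in matrix[1:]])
        for j in range(len(matrix))
    )


def add_in_row(M, N, j):
    """Return the matrix obtained from M by adding row j of N to row j of M.

    M and N must agree in all rows other than row j (0-based index).
    """
    if any(M[i] != N[i] for i in range(len(M)) if i != j):
        raise ValueError("M and N differ outside row j")
    return [
        [x + y for x, y in zip(M[i], N[i])] if i == j else list(M[i])
        for i in range(len(M))
    ]


M = [[1, 2, 3], [4, 5, 6], [7, 8, 10]]
N = [[1, 2, 3], [0, 1, -1], [7, 8, 10]]  # agrees with M except in row 1
print(det(add_in_row(M, N, 1)) == det(M) + det(N))
```

The check fails with a `ValueError` if the two matrices differ in more than the chosen row, mirroring the hypothesis of the property.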


Now, let us prove the MTT. We proceed by strong induction on the # of arcs 
of D. 

Induction step: Let m € N. Assume (as the induction hypothesis) that the 
MTT holds for all digraphs D that have < m arcs. We must now prove it for 
our digraph D with m arcs. 

WLOG assume that r = n (otherwise, swap r with n). 

If each vertex v ∈ V \ {r} has outdegree ≤ 1, then the MTT holds by Lemma
5.14.13. Thus, we WLOG assume that some vertex v ∈ V \ {r} has outdegree
> 1. Pick such a vertex v. We color each arc with source v either red or blue,
making sure that at least one arc is red and at least one arc is blue. (We can
do this, since v has outdegree > 1.) All arcs that do not have source v remain
uncolored.

Now, let $D^{\mathrm{red}}$ be the subdigraph obtained from D by removing all blue arcs.
Then, $D^{\mathrm{red}}$ has fewer arcs than D. In other words, $D^{\mathrm{red}}$ has fewer than m arcs. Hence,
the induction hypothesis yields that the MTT holds for $D^{\mathrm{red}}$. That is, we have

(# of spanning arborescences of $D^{\mathrm{red}}$ rooted to r) = $\det \left( L^{\mathrm{red}}_{\sim r,\sim r} \right)$,

where $L^{\mathrm{red}}$ means the Laplacian of $D^{\mathrm{red}}$.

Likewise, let $D^{\mathrm{blue}}$ be the subdigraph obtained from D by removing all red
arcs. Then, $D^{\mathrm{blue}}$ has fewer arcs than D. Hence, the induction hypothesis yields
that the MTT holds for $D^{\mathrm{blue}}$. That is,

(# of spanning arborescences of $D^{\mathrm{blue}}$ rooted to r) = $\det \left( L^{\mathrm{blue}}_{\sim r,\sim r} \right)$,

where $L^{\mathrm{blue}}$ means the Laplacian of $D^{\mathrm{blue}}$.

Example 5.14.14. Let D be the multidigraph


with r = 1. Its Laplacian is


0 1 —1 0 0 
L=j| -1 0 3 Sk -1 
0 0 0 1 -!1 
—1 0 0 0 1 


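Let us pick v = 3 (this is a vertex with outdegree > 1), and let us color the
arcs a and c red and the arcs b and d blue (various other options are possible).
Then, $D^{\mathrm{red}}$ and $D^{\mathrm{blue}}$ look as follows (along with their Laplacians $L^{\mathrm{red}}$ and
$L^{\mathrm{blue}}$):

Now, the digraphs D, $D^{\mathrm{blue}}$ and $D^{\mathrm{red}}$ differ only in the arcs with source v,
and as far as the latter arcs are concerned, the arcs of D are divided between
$D^{\mathrm{blue}}$ and $D^{\mathrm{red}}$. Hence, by the definition of the Laplacian, we have
\[
L^{\mathrm{red}} \overset{v}{\equiv} L^{\mathrm{blue}}
\qquad \text{and} \qquad
L^{\mathrm{red}} \overset{v}{+} L^{\mathrm{blue}} = L.
\]
Thus,
\[
L^{\mathrm{red}}_{\sim r,\sim r} \overset{v}{\equiv} L^{\mathrm{blue}}_{\sim r,\sim r}
\qquad \text{and} \qquad
L^{\mathrm{red}}_{\sim r,\sim r} \overset{v}{+} L^{\mathrm{blue}}_{\sim r,\sim r} = L_{\sim r,\sim r}
\]
(here, we have used the fact that r = n and v ≠ r, so that when we remove
the r-th row and the r-th column of the matrix L, the v-th row remains the v-th
row). Hence,
\[
\det \left( L_{\sim r,\sim r} \right)
= \det \left( L^{\mathrm{red}}_{\sim r,\sim r} \overset{v}{+} L^{\mathrm{blue}}_{\sim r,\sim r} \right)
= \det \left( L^{\mathrm{red}}_{\sim r,\sim r} \right) + \det \left( L^{\mathrm{blue}}_{\sim r,\sim r} \right)
\]
(by the multilinearity of the determinant).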
However, a similar equality holds for the # of spanning arborescences: namely,
we have

(# of spanning arborescences of D rooted to r)
= (# of spanning arborescences of $D^{\mathrm{red}}$ rooted to r)
+ (# of spanning arborescences of $D^{\mathrm{blue}}$ rooted to r).

Here is why: Recall that an arborescence rooted to r must satisfy $\deg^+ v = 1$
(by Statement A'6 in the Dual Arborescence Equivalence Theorem (Theorem
5.10.5), since v ∈ V \ {r}). In other words, an arborescence rooted to r must
contain exactly one arc with source v. In particular, a spanning arborescence of
D rooted to r must contain either a red arc or a blue arc, but not both at the
same time. In the former case, it is a spanning arborescence of $D^{\mathrm{red}}$; in the latter,
it is a spanning arborescence of $D^{\mathrm{blue}}$. Conversely, any spanning arborescence
of $D^{\mathrm{red}}$ or of $D^{\mathrm{blue}}$ rooted to r is automatically a spanning arborescence of D
rooted to r. Thus,
\[
\begin{aligned}
& (\text{\# of spanning arborescences of } D \text{ rooted to } r) \\
&= \underbrace{(\text{\# of spanning arborescences of } D^{\mathrm{red}} \text{ rooted to } r)}_{= \det \left( L^{\mathrm{red}}_{\sim r,\sim r} \right) \text{ (as we saw above)}}
 + \underbrace{(\text{\# of spanning arborescences of } D^{\mathrm{blue}} \text{ rooted to } r)}_{= \det \left( L^{\mathrm{blue}}_{\sim r,\sim r} \right) \text{ (as we saw above)}} \\
&= \det \left( L^{\mathrm{red}}_{\sim r,\sim r} \right) + \det \left( L^{\mathrm{blue}}_{\sim r,\sim r} \right)
 = \det \left( L_{\sim r,\sim r} \right)
\end{aligned}
\]
(since we proved that $\det (L_{\sim r,\sim r}) = \det (L^{\mathrm{red}}_{\sim r,\sim r}) + \det (L^{\mathrm{blue}}_{\sim r,\sim r})$). That is, the
MTT holds for our digraph D and its vertex r. This completes the induction
step, and thus the MTT (Theorem 5.14.7) is proved. □
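The induction in this proof can be read as a (very inefficient) recursive counting procedure: split the arcs leaving a vertex of outdegree > 1 into "red" and "blue", recurse on both smaller digraphs, and fall back on the argument of Lemmas 5.14.12 and 5.14.13 when every non-root vertex has outdegree ≤ 1. The Python sketch below follows this idea under some assumptions of ours: a multidigraph is given as a list of (source, target) pairs, exactly one arc is colored red at each split, and the function name `count_arborescences` is hypothetical.

```python
def count_arborescences(vertices, arcs, root):
    """Count spanning arborescences rooted to `root` by the red/blue recursion."""
    out_arcs = {v: [] for v in vertices}
    for index, (source, _target) in enumerate(arcs):
        out_arcs[source].append(index)

    # Induction step: some non-root vertex v has outdegree > 1.
    for v in vertices:
        if v != root and len(out_arcs[v]) > 1:
            red = out_arcs[v][0]          # one red arc ...
            blue = set(out_arcs[v][1:])   # ... all other arcs out of v are blue
            d_red = [arc for i, arc in enumerate(arcs) if i not in blue]
            d_blue = [arc for i, arc in enumerate(arcs) if i != red]
            return (count_arborescences(vertices, d_red, root)
                    + count_arborescences(vertices, d_blue, root))

    # Base case: every non-root vertex has outdegree <= 1. The count is 1 if
    # following the unique out-arcs from each vertex reaches the root, else 0.
    for v in vertices:
        seen = set()
        current = v
        while current != root:
            if current in seen or not out_arcs[current]:
                return 0  # trapped in a cycle, or stuck at a vertex of outdegree 0
            seen.add(current)
            current = arcs[out_arcs[current][0]][1]
    return 1


# All arcs (u, v) with u != v on 4 vertices, i.e. the digraph K_4^bidir.
k4_bidir = [(u, v) for u in range(4) for v in range(4) if u != v]
print(count_arborescences(range(4), k4_bidir, 0))
```

For the bidirected complete digraph on n vertices, the result should agree with Cayley's formula $n^{n-2}$; the running time is exponential, so this is meant for tiny examples only.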


Our above proof of Theorem [5.14.7] has followed Theorem 10.4]. 
Other proofs can be found across the literature, e.g., in Theorem 
7], in [Margol10, Theorem 2.8], in Theorem 1] and in 
Theorem 2.5.3]. (Some of these sources prove more general versions of the 
theorem. Confusingly, each source uses different notations and works in a 
slightly different setup, although most of them quickly reveal themselves to be 
equivalent upon some introspection.) 


5.14.8. Further exercises on the Laplacian

Exercise 5.18. Let G = (V, E, φ) be a multigraph. Let L be the Laplacian of
the digraph $G^{\mathrm{bidir}}$. Prove that L is positive semidefinite.


[Hint: Write L as $N^T N$, where N or $N^T$ is some matrix you have seen
before.

Note that the statement is not true if we replace $G^{\mathrm{bidir}}$ by an arbitrary
digraph D.]


The following two exercises stand at the beginning of the theory of chip-firing 
and related dynamical systems on a digraph (see [CorPer18], and 
for much more). While the Laplacian is not mentioned in them 
directly, it is implicitly involved in the definition of a “donation” (how?). 


Exercise 5.19. Let D = (V, A, ψ) be a strongly connected multidigraph.

A wealth distribution on D shall mean a family $(k_v)_{v \in V}$ of integers (one
for each vertex v ∈ V). If $k = (k_v)_{v \in V}$ is a wealth distribution, then we refer
to each value $k_v$ as the wealth of the vertex v, and we define the total wealth
of k to be the sum $\sum_{v \in V} k_v$. We say that a vertex v is in debt in a given wealth
distribution $k = (k_v)_{v \in V}$ if its wealth $k_v$ is negative.


For any vertices v and w, we let $a_{v,w}$ denote the number of arcs that have
source v and target w.

A donation is an operation that transforms a wealth distribution as follows:
We choose a vertex v, and we decrease its wealth by its outdegree
$\deg^+ v$, and then increase the wealth of each vertex w ∈ V (including v itself)
by $a_{v,w}$. (You can think of v as donating a unit of wealth for each arc that has
source v. This unit flows to the target of this arc. Note that a donation does
not change the total wealth.)

Let k be a wealth distribution on D whose total wealth is larger than |A| −
|V|. Prove that by an appropriately chosen finite sequence of donations, we
can ensure that no vertex is in debt.


[Example: For instance, consider the digraph


with wealth distribution $(k_1, k_2, k_3, k_4, k_5, k_6) = (-1, -1, 1, 2, 0, 1)$. The vertices
1 and 2 are in debt here, but it is possible to get all vertices out of debt
by having the vertices 4, 5, 6, 1 donate in some order (the order clearly does
not matter for the result^{67}).

Note that vertices are allowed to donate multiple times (although in the
above example, this was unnecessary).]


[Hint: A donation will be called safe if its donor v (that is, the vertex chosen
to lose wealth) satisfies $k_v \geq \deg^+ v$, where k is the wealth distribution
just before this donation. Start by showing that if the total wealth is larger
than |A| − |V|, then at least one vertex v has wealth ≥ $\deg^+ v$ (and thus can
make a safe donation). Next, show that for any given wealth distribution k,
there are only finitely many wealth distributions that can be obtained from
k by a sequence of safe donations. Finally, for any vertex v, find a rational
quantity that increases every time that a donor distinct from v makes a donation.
Conclude that in a sufficiently long sequence of safe donations, every
vertex must appear as a donor. But a donor of a safe donation must be out
of debt just before its safe donation, and will never go back into debt.]
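To experiment with donations, one can simulate them directly. The following Python sketch is an illustration under our own conventions (it is not part of the exercise): arcs are (source, target) pairs, a wealth distribution is a dictionary, and the hypothetical function `clear_debts` keeps performing safe donations in the sense of the hint, always choosing the smallest eligible vertex, with a step limit as a safeguard. The example digraph is a directed 3-cycle chosen by us, since it is strongly connected and has |A| − |V| = 0.

```python
def donate(wealth, arcs, v):
    """Return the wealth distribution obtained after vertex v donates once."""
    new_wealth = dict(wealth)
    for source, target in arcs:
        if source == v:
            new_wealth[source] -= 1  # v pays one unit for this arc ...
            new_wealth[target] += 1  # ... and the target of the arc receives it
    return new_wealth


def clear_debts(wealth, arcs, max_steps=10_000):
    """Perform safe donations until no vertex is in debt (or give up)."""
    outdegree = {v: 0 for v in wealth}
    for source, _target in arcs:
        outdegree[source] += 1
    for _ in range(max_steps):
        if all(k >= 0 for k in wealth.values()):
            return wealth
        # A safe donor is a vertex v whose wealth is at least its outdegree.
        donor = next((v for v in sorted(wealth) if wealth[v] >= outdegree[v]), None)
        if donor is None:
            raise ValueError("no safe donation is available")
        wealth = donate(wealth, arcs, donor)
    raise RuntimeError("step limit reached")


cycle = [(1, 2), (2, 3), (3, 1)]      # directed 3-cycle
start = {1: -2, 2: 0, 3: 3}           # total wealth 1 > |A| - |V| = 0
print(clear_debts(start, cycle))
```

A loop (an arc whose source and target coincide) changes nothing in `donate`, in agreement with the definition of a donation.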


Exercise 5.20. We continue with the setting and terminology of Exercise 5.19.

A clawback is an operation that transforms a wealth distribution as follows:
We choose a vertex v, and we increase its wealth by its outdegree
$\deg^+ v$, and then decrease the wealth of each vertex w ∈ V (including v
itself) by $a_{v,w}$. (Thus, a clawback is the inverse of a donation.)

Let k be a wealth distribution on D whose total wealth is larger than |A| −
|V|. Prove that by an appropriately chosen finite sequence of clawbacks, we
can ensure that no vertex is in debt.


[Remark: Note that we are still assuming D to be strongly connected.
Otherwise, the truth of the claim is not guaranteed. For instance, for the
digraph

[Figure: digraph on the vertices 1, 2, 3, 4]

with wealth distribution $(k_1, k_2, k_3, k_4) = (0, 0, -1, 2)$, no sequence of donations
and clawbacks will result in every vertex being out of debt (since the
wealth difference $k_4 - k_3$ is preserved under any donation or clawback, but
this difference is too large to come from a debt-free distribution with total
wealth 1).]


[Hint: Show that any donation is equivalent to an appropriately chosen
composition of clawbacks. Something we know about the Laplacian may
come in useful here.]


67 Depending on the order, some vertices will go into debt in the process, but this is okay as
long as they ultimately end up debt-free.

5.14.9. Application: Counting Eulerian circuits of $K_n^{\mathrm{bidir}}$


Here is one more consequence of the MTT: 


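Proposition 5.14.15. Let n be a positive integer. Pick any arc a of the multidigraph
$K_n^{\mathrm{bidir}}$. Then, the # of Eulerian circuits of $K_n^{\mathrm{bidir}}$ whose first arc is a is
$n^{n-2} \cdot (n-2)!^n$.

Proof. Let r be the source of the arc a. The digraph $K_n^{\mathrm{bidir}}$ is balanced, and each
of its vertices has outdegree n − 1. By the BEST' theorem (Theorem 5.10.4), we
have
\[
\begin{aligned}
& (\text{\# of Eulerian circuits of } K_n^{\mathrm{bidir}} \text{ whose first arc is } a) \\
&= \underbrace{(\text{\# of spanning arborescences of } K_n^{\mathrm{bidir}} \text{ rooted to } r)}_{= n^{n-2}} \cdot \prod_{u=1}^{n} \underbrace{(\deg^+ u - 1)!}_{= (n-2)!} \\
&= n^{n-2} \cdot \prod_{u=1}^{n} (n-2)! = n^{n-2} \cdot (n-2)!^n ,
\end{aligned}
\]
where the equality (# of spanning arborescences of $K_n^{\mathrm{bidir}}$ rooted to r) = $n^{n-2}$ holds
as we saw in Subsection 5.14.5 in the case when r = 1 (and can similarly be proved for
arbitrary r). This proves the proposition, qed. □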
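For very small n, the count in Proposition 5.14.15 can be confirmed by brute force. The Python sketch below is such a check, under conventions of our own: vertices are 0, 1, ..., n − 1, the fixed first arc is (0, 1), n ≥ 2 is assumed, and the function name `count_eulerian_circuits` is hypothetical. It simply extends a walk arc by arc, never reusing an arc, and counts the walks that use all arcs and return to the starting vertex.

```python
from math import factorial


def count_eulerian_circuits(n):
    """Count Eulerian circuits of K_n^bidir whose first arc is (0, 1); needs n >= 2."""
    arcs = [(u, v) for u in range(n) for v in range(n) if u != v]
    out_arcs = {u: [i for i, (s, _t) in enumerate(arcs) if s == u] for u in range(n)}
    first = arcs.index((0, 1))
    start = 0
    used = [False] * len(arcs)
    used[first] = True

    def extend(vertex, remaining):
        # `vertex` is the current end of the walk; `remaining` arcs are unused.
        if remaining == 0:
            return 1 if vertex == start else 0
        total = 0
        for i in out_arcs[vertex]:
            if not used[i]:
                used[i] = True
                total += extend(arcs[i][1], remaining - 1)
                used[i] = False
        return total

    return extend(1, len(arcs) - 1)


for n in (2, 3, 4):
    print(n, count_eulerian_circuits(n), n ** (n - 2) * factorial(n - 2) ** n)
```

The search is exponential in the number of arcs, so it is only practical for n up to about 4.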


In comparison, there is no good formula known for the # of Eulerian circuits
of the undirected graph $K_n$. For n even, this # is 0 of course (since $K_n$ has
vertices of odd degree in this case). For n odd, the # grows very fast, but little
else is known about it (see https://oeis.org/A135388 for some known values,
and see Exercise 5.22 for a divisibility property).


Exercise 5.21. Let n be a positive integer. Let N = {1, 2, ..., n}. A map
f : N → N is said to be n-potent if each i ∈ N satisfies $f^{n-1}(i) = n$. (As
usual, $f^k$ denotes the k-fold composition f ∘ f ∘ ··· ∘ f.)

Prove that the # of n-potent maps f : N → N is $n^{n-2}$.


[Hint: What do these n-potent maps have to do with trees?]


Exercise 5.22. Let n = 2m + 1 > 2 be an odd integer. Let e be an edge of the
(undirected) complete graph $K_n$. Prove that the # of Eulerian circuits of $K_n$
that start with e is a multiple of $(m-1)!^n$.


[Hint: Argue that each Eulerian circuit of $K_n$ is an Eulerian circuit of a
unique balanced tournament. Here, a “balanced tournament” means a balanced
digraph obtained from $K_n$ by orienting each edge.]

5.15. The undirected Matrix-Tree Theorem
5.15.1. The theorem


The Matrix-Tree Theorem becomes simpler if we apply it to a digraph of the
form $G^{\mathrm{bidir}}$.


Theorem 5.15.1 (undirected Matrix-Tree Theorem). Let G = (V, E, φ) be a
multigraph. Assume that V = {1, 2, ..., n} for some positive integer n.

Let L be the Laplacian of the digraph $G^{\mathrm{bidir}}$. Explicitly, this is the n × n-matrix
$L \in \mathbb{Z}^{n \times n}$ whose entries are given by
\[
L_{i,j} = (\deg i) \cdot [i = j] - a_{i,j},
\]
where $a_{i,j}$ is the # of edges of G that have endpoints i and j (with loops
counting twice). Then:


(a) For any vertex r of G, we have
\[
(\text{\# of spanning trees of } G) = \det \left( L_{\sim r,\sim r} \right).
\]

(b) Let t be an indeterminate. Expand the determinant $\det (tI_n + L)$ (here,
$I_n$ denotes the n × n identity matrix) as a polynomial in t:
\[
\det (tI_n + L) = c_n t^n + c_{n-1} t^{n-1} + \cdots + c_1 t^1 + c_0 t^0,
\]
where $c_0, c_1, \ldots, c_n$ are numbers. (Note that this is the characteristic
polynomial of L up to substituting −t for t and multiplying by a power
of −1. Some of its coefficients are $c_n = 1$ and $c_{n-1} = \operatorname{Tr} L$ and $c_0 = \det L$.) Then,
\[
(\text{\# of spanning trees of } G) = \frac{1}{n} c_1.
\]

(c) Let $\lambda_1, \lambda_2, \ldots, \lambda_n$ be the eigenvalues of L, listed in such a way that
$\lambda_n = 0$ (we know that 0 is an eigenvalue of L, since L is singular).
Then,
\[
(\text{\# of spanning trees of } G) = \frac{1}{n} \cdot \lambda_1 \lambda_2 \cdots \lambda_{n-1}.
\]
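Part (a) of the theorem translates directly into a small computation. The Python sketch below is an illustration with choices of our own: vertices are numbered 0, 1, ..., n − 1, a multigraph is a list of edges (u, v), and the helper names `det`, `laplacian`, `minor` and `spanning_trees` are hypothetical. The determinant is computed exactly over the rationals by Gaussian elimination, so no floating-point error is involved.

```python
from fractions import Fraction


def det(matrix):
    """Exact determinant via Gaussian elimination over the rationals."""
    a = [[Fraction(x) for x in row] for row in matrix]
    n = len(a)
    result = Fraction(1)
    for col in range(n):
        pivot = next((i for i in range(col, n) if a[i][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            result = -result  # a row swap flips the sign
        result *= a[col][col]
        for i in range(col + 1, n):
            factor = a[i][col] / a[col][col]
            for j in range(col, n):
                a[i][j] -= factor * a[col][j]
    return result


def laplacian(n, edges):
    """Laplacian of G^bidir, i.e. L[i][j] = (deg i)[i = j] - a_{i,j}."""
    L = [[0] * n for _ in range(n)]
    for u, v in edges:
        L[u][u] += 1
        L[v][v] += 1
        L[u][v] -= 1
        L[v][u] -= 1
    return L


def minor(L, r):
    """The matrix L with its r-th row and r-th column removed."""
    return [[L[i][j] for j in range(len(L)) if j != r] for i in range(len(L)) if i != r]


def spanning_trees(n, edges, r=0):
    return det(minor(laplacian(n, edges), r))


k4 = [(u, v) for u in range(4) for v in range(u + 1, 4)]
print(spanning_trees(4, k4))
```

By part (a), the value returned should not depend on the choice of r; summing `det(minor(L, r))` over all r then gives n times the number of spanning trees, which is the content of part (b).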


Proof. (a) Let r be a vertex of G. Then, Proposition 5.13.1 (b) shows that there is
a bijection


{spanning arborescences of $G^{\mathrm{bidir}}$ rooted to r} → {spanning trees of G}.

Hence, by the bijection principle, we have

(# of spanning trees of G)

= (# of spanning arborescences of $G^{\mathrm{bidir}}$ rooted to r)

= $\det (L_{\sim r,\sim r})$ (by the Matrix-Tree Theorem (Theorem 5.14.7)).

This proves Theorem 5.15.1 (a).

(b) We claim that
\[
c_1 = \sum_{r=1}^{n} \det \left( L_{\sim r,\sim r} \right). \tag{23}
\]
Note that this is a purely linear-algebraic result, and has nothing to do with the
fact that L is the Laplacian of a digraph; it holds just as well if L is replaced by
any square matrix.


Once (23) is proved, Theorem 5.15.1 (b) will easily follow, because (23) entails
\[
\begin{aligned}
\frac{1}{n} c_1
&= \frac{1}{n} \sum_{r=1}^{n} \underbrace{\det \left( L_{\sim r,\sim r} \right)}_{\substack{= (\text{\# of spanning trees of } G) \\ \text{(by Theorem 5.15.1 (a))}}}
= \frac{1}{n} \underbrace{\sum_{r=1}^{n} (\text{\# of spanning trees of } G)}_{= n \cdot (\text{\# of spanning trees of } G)} \\
&= \frac{1}{n} \cdot n \cdot (\text{\# of spanning trees of } G) = (\text{\# of spanning trees of } G).
\end{aligned}
\]

Thus, it remains to prove (23).

A rigorous proof of (23) can be found in Proposition 6.4.29] or in
(both of these references actually
describe all coefficients $c_0, c_1, \ldots, c_n$ of the polynomial $\det (tI_n + L)$, not just
the $t^1$-coefficient $c_1$). We shall merely outline the proof of (23) on a convenient
example. We want to compute $c_1$. In other words, we want to compute the
coefficient of $t^1$ in the polynomial $\det (tI_n + L)$ (since $c_1$ is defined to be this
very coefficient). Let us say that n = 4, so that L has the form
\[
L = \begin{pmatrix}
a & b & c & d \\
a' & b' & c' & d' \\
a'' & b'' & c'' & d'' \\
a''' & b''' & c''' & d'''
\end{pmatrix}.
\]
Thus,
\[
\det (tI_n + L) = \det \begin{pmatrix}
t + a & b & c & d \\
a' & t + b' & c' & d' \\
a'' & b'' & t + c'' & d'' \\
a''' & b''' & c''' & t + d'''
\end{pmatrix}.
\]
Imagine expanding the right hand side (using the Leibniz formula) and expanding
the resulting products further. For instance, the product
\[
(t + a) (t + b') d'' c'''
\]
becomes $ttd''c''' + tb'd''c''' + atd''c''' + ab'd''c'''$. In the huge sum that results,
we are interested in those addends that contain exactly one t, because it is
precisely these addends that contribute to the coefficient of $t^1$ in the polynomial
$\det (tI_n + L)$. Where do these addends come from? To pick up exactly one t
from a product like $(t + a) (t + b') d''c'''$, we need to have at least one diagonal
entry in our product (for example, we cannot pick up any t from the product
$cd'b''a'''$), and we need to pick out the t from this diagonal entry (rather than,
e.g., the a or b' or c'' or d'''). If we pick the r-th diagonal entry, then the rest of
the product is part of the expansion of $\det (L_{\sim r,\sim r})$ (since we must not pick any
further ts and thus can pretend that they are not there in the first place). Thus,
the total $t^1$-coefficient in $\det (tI_n + L)$ will be $\sum_{r=1}^{n} \det (L_{\sim r,\sim r})$. This proves (23),
and thus the proof of Theorem 5.15.1 (b) is complete.


(c) Consider the polynomial $\det (tI_n + L)$ introduced in part (b), and in particular
its $t^1$-coefficient $c_1$.

It is known that the characteristic polynomial $\det (tI_n - L)$ of L is a monic
polynomial of degree n, and that its roots are the eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_n$ of
L. Hence, it can be factored as follows:
\[
\det (tI_n - L) = (t - \lambda_1) (t - \lambda_2) \cdots (t - \lambda_n).
\]
Substituting −t for t on both sides of this equality, we obtain
\[
\det (-tI_n - L) = (-t - \lambda_1) (-t - \lambda_2) \cdots (-t - \lambda_n).
\]
Multiplying both sides of this equality by $(-1)^n$, we find
\[
\begin{aligned}
\det (tI_n + L) &= (t + \lambda_1) (t + \lambda_2) \cdots (t + \lambda_n) \\
&= (t + \lambda_1) (t + \lambda_2) \cdots (t + \lambda_{n-1}) \, t \qquad (\text{since } \lambda_n = 0).
\end{aligned}
\]
Hence, the $t^1$-coefficient of the polynomial $\det (tI_n + L)$ is $\lambda_1 \lambda_2 \cdots \lambda_{n-1}$ (since
this is clearly the $t^1$-coefficient on the right hand side). Since we defined $c_1$
to be the $t^1$-coefficient of the polynomial $\det (tI_n + L)$, we thus conclude that
$c_1 = \lambda_1 \lambda_2 \cdots \lambda_{n-1}$. However, Theorem 5.15.1 (b) yields
\[
(\text{\# of spanning trees of } G) = \frac{1}{n} \underbrace{c_1}_{= \lambda_1 \lambda_2 \cdots \lambda_{n-1}} = \frac{1}{n} \cdot \lambda_1 \lambda_2 \cdots \lambda_{n-1}.
\]
This proves Theorem 5.15.1 (c). □


5.15.2. Application: counting spanning trees of $K_{n,m}$


Laplacians of digraphs often have computable eigenvalues, so Theorem 5.15.1
(c) is actually pretty useful. A striking example of a # of spanning trees (specifically,
of the n-hypercube graph $Q_n$, which we already met in Subsection 2.14.4)
that can be counted using eigenvalues will appear in Exercise 5.26.

Here, however, let us give a simpler example, in which Theorem 5.15.1 (a)
suffices:


Exercise 5.23. Let n and m be two positive integers. Let $K_{n,m}$ be the simple
graph with n + m vertices


1, 2, ..., n and −1, −2, ..., −m,


where two vertices i and j are adjacent if and only if they have opposite
signs (i.e., each positive vertex is adjacent to each negative vertex, but no two
vertices of the same sign are adjacent).

[For example, here is how Ks, looks like: 


How many spanning trees does $K_{n,m}$ have?


Solution. If we rename the negative vertices −1, −2, ..., −m as n + 1, n + 2, ..., n + m,
then the Laplacian L of the digraph $K_{n,m}^{\mathrm{bidir}}$ can be written in block-matrix notation
as follows:
\[
L = \begin{pmatrix} A & B \\ C & D \end{pmatrix},
\]
where

• A is a diagonal n × n-matrix all of whose diagonal entries are equal to m
(since there are no edges between positive vertices, and since each positive
vertex has degree m);

• B is an n × m-matrix all of whose entries equal −1;

• C is an m × n-matrix all of whose entries equal −1;

• D is a diagonal m × m-matrix all of whose diagonal entries are equal to n.


For instance, if n = 3 and m = 2, then
\[
L = \begin{pmatrix}
2 & 0 & 0 & -1 & -1 \\
0 & 2 & 0 & -1 & -1 \\
0 & 0 & 2 & -1 & -1 \\
-1 & -1 & -1 & 3 & 0 \\
-1 & -1 & -1 & 0 & 3
\end{pmatrix}.
\]

Theorem 5.15.1 (a) yields

(# of spanning trees of $K_{n,m}$) = $\det (L_{\sim r,\sim r})$ for any vertex r of $K_{n,m}$;

thus, we need to compute $\det (L_{\sim r,\sim r})$ for some vertex r. We let r = 1. Then, the
submatrix $L_{\sim r,\sim r} = L_{\sim 1,\sim 1}$ of L again can be written in block-matrix notation
as follows:
\[
L_{\sim r,\sim r} = \begin{pmatrix} A & B \\ C & D \end{pmatrix}, \tag{24}
\]
where (from now on)

• A is a diagonal (n − 1) × (n − 1)-matrix all of whose diagonal entries are
equal to m;

• B is an (n − 1) × m-matrix all of whose entries equal −1;

• C is an m × (n − 1)-matrix all of whose entries equal −1;

• D is a diagonal m × m-matrix all of whose diagonal entries are equal to n.
Fortunately, determinants of block matrices are often not hard to compute, at
least when some of the blocks are invertible. For example, the Schur complement
provides a formula for such determinants when A is invertible. Our life here is even easier, since A and D are
multiples of identity matrices: namely, $A = mI_{n-1}$ and $D = nI_m$. We perform
a “blockwise row transformation” on the block matrix $L_{\sim r,\sim r} = \begin{pmatrix} A & B \\ C & D \end{pmatrix}$,
specifically subtracting the $CA^{-1}$-multiple of the first “block row” $\begin{pmatrix} A & B \end{pmatrix}$
from the second “block row” $\begin{pmatrix} C & D \end{pmatrix}$ (yes, this is legitimate: it is the same as
left-multiplying by the block matrix $\begin{pmatrix} I_{n-1} & 0 \\ -CA^{-1} & I_m \end{pmatrix}$, which has determinant
1 because it is lower-triangular). As a result, we obtain
\[
\det \begin{pmatrix} A & B \\ C & D \end{pmatrix}
= \det \begin{pmatrix} A & B \\ C - CA^{-1}A & D - CA^{-1}B \end{pmatrix}
= \det \begin{pmatrix} A & B \\ 0 & D - CA^{-1}B \end{pmatrix}.
\]
The matrix on the right is “block-upper triangular”, so its determinant factors
as follows^{68}:
\[
\det \begin{pmatrix} A & B \\ 0 & D - CA^{-1}B \end{pmatrix} = \det A \cdot \det \left( D - CA^{-1}B \right).
\]


68 We are using the fact that if a matrix is block-triangular (with all diagonal blocks being square
matrices), then its determinant is the product of the determinants of its diagonal blocks.
See, e.g., https://math.stackexchange.com/a/1221066/ or [Grinbe20, Exercise 6.29] for a
proof of this fact.


An introduction to graph theory, version August 2, 2023 page 243 


Of course, $\det A = m^{n-1}$, since A is a diagonal matrix with m, m, ..., m on the
diagonal. Computing $\det (D - CA^{-1}B)$ is a bit more complicated, but still
doable: The matrix $A^{-1}$ is a diagonal matrix with $m^{-1}, m^{-1}, \ldots, m^{-1}$ on the
diagonal; thus, its role in the product $CA^{-1}B$ is merely to multiply everything
by $m^{-1}$. Hence, $CA^{-1}B = m^{-1}CB$. Since all entries of C and B are −1's, we
see that all entries of CB are (n − 1)'s. Putting all of this together, we see
that $D - CA^{-1}B$ is the m × m-matrix all of whose diagonal entries are equal to
$n - m^{-1}(n-1)$ and all of whose off-diagonal entries are equal to $-m^{-1}(n-1)$.
We have already computed the determinant of a matrix much like this back in
our proof of Cayley's Formula (Subsection 5.14.5); let us deal with the general
case:


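Proposition 5.15.2. Let $n \in \mathbb{N}$. Let x and a be two numbers. Then,
\[
\det \underbrace{\begin{pmatrix}
x & a & a & \cdots & a & a \\
a & x & a & \cdots & a & a \\
a & a & x & \cdots & a & a \\
\vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\
a & a & a & \cdots & x & a \\
a & a & a & \cdots & a & x
\end{pmatrix}}_{\substack{\text{the } n \times n\text{-matrix} \\ \text{whose diagonal entries are } x \\ \text{and whose off-diagonal entries are } a}}
= (x + (n-1) a) (x - a)^{n-1}.
\]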


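Proposition 5.15.2 can be proved using similar reasoning as the determinant
in Subsection 5.14.5; we will say more about it later. For now, let us apply it to
m, $n - m^{-1}(n-1)$ and $-m^{-1}(n-1)$ instead of n, x and a, to obtain
\[
\begin{aligned}
\det \left( D - CA^{-1}B \right)
&= \underbrace{\left( \left( n - m^{-1}(n-1) \right) + (m-1) \left( -m^{-1}(n-1) \right) \right)}_{= n - (n-1) = 1}
\cdot \underbrace{\left( \left( n - m^{-1}(n-1) \right) - \left( -m^{-1}(n-1) \right) \right)^{m-1}}_{= n^{m-1}} \\
&= n^{m-1}.
\end{aligned}
\]

Now, it is time to combine everything we know. Theorem 5.15.1 (a) yields
\[
\begin{aligned}
(\text{\# of spanning trees of } K_{n,m})
&= \det \left( L_{\sim r,\sim r} \right)
= \det \begin{pmatrix} A & B \\ 0 & D - CA^{-1}B \end{pmatrix} \\
&= \underbrace{\det A}_{= m^{n-1}} \cdot \underbrace{\det \left( D - CA^{-1}B \right)}_{= n^{m-1}}
= m^{n-1} \cdot n^{m-1}.
\end{aligned}
\]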


Thus, we have obtained the following: 


Theorem 5.15.3. Let n and m be two positive integers. Let $K_{n,m}$ be the simple
graph with n + m vertices


1, 2, ..., n and −1, −2, ..., −m,


where two vertices i and j are adjacent if and only if they have opposite signs.
Then,
\[
(\text{\# of spanning trees of } K_{n,m}) = m^{n-1} \cdot n^{m-1}.
\]


See [AbuSbe88] for a combinatorial proof of this theorem. 
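For small n and m, the formula of Theorem 5.15.3 can also be confirmed without any linear algebra, by listing all sets of n + m − 1 edges and keeping those that contain no cycle. The Python sketch below does this with a union-find structure; it is only a brute-force sanity check of ours, with hypothetical function names, and with the vertices renamed to 0, 1, ..., n + m − 1 (the first n being the positive ones).

```python
from itertools import combinations


def is_forest(num_vertices, edge_subset):
    """Return True if the given edges contain no cycle (union-find check)."""
    parent = list(range(num_vertices))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for u, v in edge_subset:
        root_u, root_v = find(u), find(v)
        if root_u == root_v:
            return False  # this edge would close a cycle
        parent[root_u] = root_v
    return True


def count_spanning_trees_knm(n, m):
    """Brute-force count of the spanning trees of K_{n,m}."""
    edges = [(i, n + j) for i in range(n) for j in range(m)]
    total = n + m
    # A cycle-free set of (total - 1) edges on `total` vertices is a spanning tree.
    return sum(is_forest(total, subset) for subset in combinations(edges, total - 1))


for n, m in [(2, 3), (3, 3)]:
    print(n, m, count_spanning_trees_knm(n, m), m ** (n - 1) * n ** (m - 1))
```

The number of edge subsets grows quickly, so this check is only feasible for very small n and m.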


Exercise 5.24. Let n be a positive integer. Let $K_{n,2}$ be the simple graph with
vertex set {1, 2, ..., n} ∪ {−1, −2} such that two vertices of $K_{n,2}$ are adjacent
if and only if they have opposite signs (i.e., each positive vertex is adjacent to
each negative vertex, but no two vertices of the same sign are adjacent). We
regard $K_{n,2}$ as a multigraph in the usual way.


(a) Without using the matrix-tree theorem, prove that the number of spanning
trees of $K_{n,2}$ is $n \cdot 2^{n-1}$.


(b) Let $K'_{n,2}$ be the graph obtained by adding a new edge {−1, −2} to $K_{n,2}$.
How many spanning trees does $K'_{n,2}$ have?


[Example: Here is the graph $K_{n,2}$ for n = 5:

And here is the corresponding graph $K'_{n,2}$:

]


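Exercise 5.25. Let n be a positive integer. Let A be the (n − 1) × (n − 1)-matrix
\[
\begin{pmatrix}
2 & -1 & 0 & \cdots & 0 \\
-1 & 2 & -1 & \cdots & 0 \\
0 & -1 & 2 & \cdots & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
0 & 0 & 0 & \cdots & 2
\end{pmatrix}
\]
whose (i, j)-th entry is
\[
A_{i,j} := \begin{cases}
2, & \text{if } i = j; \\
-1, & \text{if } |i - j| = 1; \\
0, & \text{otherwise}
\end{cases}
\qquad \text{for all } i, j \in \{1, 2, \ldots, n-1\}.
\]

Prove that det A = n.
[Hint: Recall Example 5.4.4.]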


Exercise 5.26. Let n be a positive integer. Let $Q_n$ be the n-hypercube graph (as
defined in Definition 2.14.7). Recall that its vertex set is the set $V := \{0,1\}^n$
of length-n bitstrings, and that two vertices are adjacent if and only if they
differ in exactly one bit. Our goal is to compute the # of spanning trees of
$Q_n$.

Let D be the digraph $Q_n^{\mathrm{bidir}}$. Let L be the Laplacian of D. We regard L as a
V × V-matrix (i.e., as a $2^n \times 2^n$-matrix whose rows and columns are indexed
by bitstrings in V).

We shall use the notation $a_i$ for the i-th entry of a bitstring a. Thus, each
bitstring a ∈ V has the form $a = (a_1, a_2, \ldots, a_n)$. (We shall avoid the shorthand
notation $a_1 a_2 \cdots a_n$ here, as it could be mistaken for an actual product.)

For any two bitstrings a, b ∈ V, we define the number $\langle a, b \rangle$ to be the
integer $a_1 b_1 + a_2 b_2 + \cdots + a_n b_n$.

(a) Prove that every bitstring a ∈ V satisfies
\[
\sum_{b \in V} (-1)^{\langle a, b \rangle} = \begin{cases}
2^n, & \text{if } a = \mathbf{0}; \\
0, & \text{otherwise}.
\end{cases}
\]
Here, $\mathbf{0}$ denotes the bitstring (0, 0, ..., 0) ∈ V.


Now, define a further V × V-matrix G by requiring that its (a, b)-th entry
is
\[
G_{a,b} = (-1)^{\langle a, b \rangle} \qquad \text{for any } a, b \in V.
\]
Furthermore, define a diagonal V × V-matrix D by requiring that its (a, a)-th
entry is
\[
\begin{aligned}
D_{a,a} &= 2 \cdot (\text{\# of } i \in \{1, 2, \ldots, n\} \text{ such that } a_i = 1) \\
&= 2 \cdot (\text{the number of 1s in } a) \qquad \text{for any } a \in V
\end{aligned}
\]
(and its off-diagonal entries are 0).
Prove the following:


(b) We have $G^2 = 2^n \cdot I$, where I is the identity V × V-matrix.
(c) We have $GLG^{-1} = D$.
(d) The eigenvalues of L are 2k for all k ∈ {0, 1, ..., n}, and each eigenvalue
2k appears with multiplicity $\binom{n}{k}$.
(e) The # of spanning trees of $Q_n$ is
\[
\frac{1}{2^n} \prod_{k=1}^{n} (2k)^{\binom{n}{k}}.
\]

[Example: As an example, here is the case n = 3. In this case, the graph
$Q_n$ looks as follows:

The matrices L, G and D are
\[
L = \begin{pmatrix}
3 & -1 & -1 & 0 & -1 & 0 & 0 & 0 \\
-1 & 3 & 0 & -1 & 0 & -1 & 0 & 0 \\
-1 & 0 & 3 & -1 & 0 & 0 & -1 & 0 \\
0 & -1 & -1 & 3 & 0 & 0 & 0 & -1 \\
-1 & 0 & 0 & 0 & 3 & -1 & -1 & 0 \\
0 & -1 & 0 & 0 & -1 & 3 & 0 & -1 \\
0 & 0 & -1 & 0 & -1 & 0 & 3 & -1 \\
0 & 0 & 0 & -1 & 0 & -1 & -1 & 3
\end{pmatrix},
\]
\[
G = \begin{pmatrix}
1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 \\
1 & -1 & 1 & -1 & 1 & -1 & 1 & -1 \\
1 & 1 & -1 & -1 & 1 & 1 & -1 & -1 \\
1 & -1 & -1 & 1 & 1 & -1 & -1 & 1 \\
1 & 1 & 1 & 1 & -1 & -1 & -1 & -1 \\
1 & -1 & 1 & -1 & -1 & 1 & -1 & 1 \\
1 & 1 & -1 & -1 & -1 & -1 & 1 & 1 \\
1 & -1 & -1 & 1 & -1 & 1 & 1 & -1
\end{pmatrix},
\]
\[
D = \begin{pmatrix}
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 2 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 2 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 4 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 2 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 4 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 4 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 6
\end{pmatrix},
\]
where the rows and the columns are ordered by listing the eight bitstrings
a ∈ V in the order 000, 001, 010, 011, 100, 101, 110, 111.]
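The matrices of this exercise are easy to build by machine, which gives a quick sanity check of parts (b) and (c) for small n (it does not replace a proof). In the Python sketch below, all names are ours, the bitstrings are listed in lexicographic order (which for n = 3 is the order 000, 001, ..., 111 used above), and part (c) is tested in the equivalent form GL = DG in order to avoid inverting G.

```python
from itertools import product


def hypercube_matrices(n):
    """Return (L, G, D) for the n-hypercube, indexed by bitstrings in lexicographic order."""
    V = list(product([0, 1], repeat=n))

    def differ_in_one_bit(a, b):
        return sum(x != y for x, y in zip(a, b)) == 1

    def inner(a, b):
        return sum(x * y for x, y in zip(a, b))

    # Laplacian of Q_n^bidir: degree n on the diagonal, -1 for adjacent bitstrings.
    L = [[n if a == b else (-1 if differ_in_one_bit(a, b) else 0) for b in V] for a in V]
    G = [[(-1) ** inner(a, b) for b in V] for a in V]
    D = [[2 * sum(a) if a == b else 0 for b in V] for a in V]
    return L, G, D


def matmul(X, Y):
    """Product of two square integer matrices given as lists of rows."""
    size = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]


L, G, D = hypercube_matrices(3)
identity_scaled = [[8 if i == j else 0 for j in range(8)] for i in range(8)]
print(matmul(G, G) == identity_scaled)   # part (b): G^2 = 2^n * I
print(matmul(G, L) == matmul(D, G))      # part (c), rewritten as G L = D G
print([D[i][i] for i in range(8)])       # diagonal of D
```

Since G is invertible by part (b), the identity GL = DG is equivalent to $GLG^{-1} = D$.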


As we promised, let us make a few more remarks about Proposition 5.15.2.
While this proposition can be proved by fairly straightforward row transformations
(first subtracting the first row from all the other rows, then factoring an
x − a from all the latter rows, then subtracting a times each of the latter rows from
the first row to obtain a triangular matrix), it can also be viewed as a particular
case of either of the following two determinantal identities:

Proposition 5.15.4. Let $n \in \mathbb{N}$. Let $a_1, a_2, \ldots, a_n$ be n numbers, and let x be a
further number. Then,
\[
\det \underbrace{\begin{pmatrix}
x & a_1 & a_2 & \cdots & a_{n-1} & a_n \\
a_1 & x & a_2 & \cdots & a_{n-1} & a_n \\
a_1 & a_2 & x & \cdots & a_{n-1} & a_n \\
\vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\
a_1 & a_2 & a_3 & \cdots & x & a_n \\
a_1 & a_2 & a_3 & \cdots & a_n & x
\end{pmatrix}}_{\text{an } (n+1) \times (n+1)\text{-matrix}}
= \left( x + \sum_{i=1}^{n} a_i \right) \prod_{i=1}^{n} (x - a_i).
\]


Proposition 5.15.5. Let $n \in \mathbb{N}$. Let $x_1, x_2, \ldots, x_n$ be n numbers, and let a be a
further number. Then,
\[
\det \begin{pmatrix}
x_1 & a & a & \cdots & a \\
a & x_2 & a & \cdots & a \\
a & a & x_3 & \cdots & a \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
a & a & a & \cdots & x_n
\end{pmatrix}
= \prod_{i=1}^{n} (x_i - a) + a \sum_{i=1}^{n} y_i,
\]
where we set $y_i := \prod_{k \in \{1, 2, \ldots, n\};\ k \neq i} (x_k - a)$ for each $i \in \{1, 2, \ldots, n\}$.


Both of these propositions make good exercises in determinant evaluation.
(Proposition|5.15.4Jis [Grinbe20| Exercise 6.21], while Proposition|5.15.5 is|https ://math.stackexch 
-) 

See [KleSta19] and [Rubey00] for more applications of the Matrix-Tree Theo- 
rem, and [Holzer22] for many more related results. 


5.16. de Bruijn sequences 
5.16.1. Definition 


Let me move on to a more intricate application of what we have learned about 
arborescences. 
A little puzzle first: What is special about the periodic sequence 


|| : 0000 1111 0110 0101 : || ? 


(This is an infinite sequence of 0’s and 1’s; the spaces between some of them 
are only for readability. The || : and : || symbols are “repeat signs” — they mean 
that everything that stands between them should be repeated over and over. So 
the sequence above is 0000 1111 0110 0101 0000 1111 ....)

One nice property of this sequence is that if you slide a “length-4 window”
(i.e., a window that shows four consecutive entries) along it, you get all 16
possible bitstrings of length 4 depending on the position of the window, and
these bitstrings do not repeat until you move 16 steps to the right. Just see
(the window is marked by square brackets):

[0000]11110110010100001111...
0[0001]1110110010100001111...
00[0011]110110010100001111...
000[0111]10110010100001111...
0000[1111]0110010100001111...
00001[1110]110010100001111...
000011[1101]10010100001111...
0000111[1011]0010100001111...
00001111[0110]010100001111...
000011110[1100]10100001111...
0000111101[1001]0100001111...
00001111011[0010]100001111...
000011110110[0101]00001111...
0000111101100[1010]0001111...
00001111011001[0100]001111...
000011110110010[1000]01111...
Note that, as you slide the window along the sequence, at each step, the first 
bit is removed and a new bit is inserted at the end. Thus, by sliding a length-4 
window along the above sequence, you run through all 16 possible length-4 
bitstrings in such a way that each bitstring is obtained from the previous one 
by removing the first bit and inserting a new bit at the end. This is nice and 
somewhat similar to Gray codes (in which you run through all bitstrings of a 
given length in such a way that only a single bit is changed at each step). 


Can we find such nice sequences for any window length, not just 4 ? 
Here is an answer for window length 3, for instance: 


|| : 00011101 : || .


What about higher window length? 

Moreover, we can ask the same question with other alphabets. For instance, 
instead of bits, here is a similar sequence for the alphabet {0,1,2} (that is, we 
use the numbers 0, 1,2 instead of 0 and 1) and window length 2: 


|| : 00 11 22 02 1 : ||.

What about the general case? Let us give it a name:


Definition 5.16.1. Let $n$ and $k$ be two positive integers, and let $K$ be a $k$-element set.

A de Bruijn sequence of order $n$ on $K$ means a $k^n$-tuple $(c_0, c_1, \ldots, c_{k^n-1})$ of elements of $K$ such that

(A) for each $n$-tuple $(a_1, a_2, \ldots, a_n) \in K^n$ of elements of $K$, there is a unique $r \in \{0, 1, \ldots, k^n - 1\}$ such that

    $(a_1, a_2, \ldots, a_n) = (c_r, c_{r+1}, \ldots, c_{r+n-1})$.

Here, the indices under the letter “c” are understood to be periodic modulo $k^n$; that is, we set $c_{q+k^n} = c_q$ for each $q \in \mathbb{Z}$ (so that $c_{k^n} = c_0$ and $c_{k^n+1} = c_1$ and so on).


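For example, for $n = 2$ and $k = 3$ and $K = \{0, 1, 2\}$, the 9-tuple

    $(0, 0, 1, 1, 2, 2, 0, 2, 1)$

is a de Bruijn sequence of order $n$ on $K$, because if we label the entries of this 9-tuple as $c_0, c_1, \ldots, c_8$ (and extend the indices periodically, so that $c_9 = c_0$), then we have

    $(0,0) = (c_0, c_1)$,   $(0,1) = (c_1, c_2)$,   $(0,2) = (c_6, c_7)$,
    $(1,0) = (c_8, c_9)$,   $(1,1) = (c_2, c_3)$,   $(1,2) = (c_3, c_4)$,
    $(2,0) = (c_5, c_6)$,   $(2,1) = (c_7, c_8)$,   $(2,2) = (c_4, c_5)$.

This de Bruijn sequence $(0,0,1,1,2,2,0,2,1)$ corresponds to the periodic sequence || : 00 11 22 02 1 : || that we found above.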
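
Condition (A) of Definition 5.16.1 is easy to test mechanically. The following is a minimal sketch in Python; the function name `is_de_bruijn` and its argument conventions are illustrative choices, not notation from these notes. It lists all cyclic windows of length n and compares them with the full list of n-tuples.

```python
from itertools import product

def is_de_bruijn(c, n, alphabet):
    """Check condition (A): every n-tuple over `alphabet` occurs exactly once
    as a cyclic window (c_r, c_{r+1}, ..., c_{r+n-1}) of the tuple c."""
    length = len(alphabet) ** n
    if len(c) != length:
        return False
    # Indices are taken modulo k^n, exactly as in the definition.
    windows = [tuple(c[(r + t) % length] for t in range(n)) for r in range(length)]
    # There are k^n windows and k^n tuples, so "each tuple appears exactly
    # once" is the same as the two sorted lists being equal.
    return sorted(windows) == sorted(product(alphabet, repeat=n))

example_ternary = (0, 0, 1, 1, 2, 2, 0, 2, 1)
example_binary = [int(bit) for bit in "0000111101100101"]
ternary_ok = is_de_bruijn(example_ternary, 2, [0, 1, 2])
binary_ok = is_de_bruijn(example_binary, 4, [0, 1])
```

Swapping the last two entries of the ternary example makes the window 01 appear twice, so the check then fails.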


5.16.2. Existence of de Bruijn sequences 


It turns out that de Bruijn sequences always exist: 


Theorem 5.16.2 (de Bruijn, Sainte-Marie). Let n and k be positive integers. 
Let K be a k-element set. Then, a de Bruijn sequence of order n on K exists. 


Proof. It looks reasonable to approach this using a digraph. For example, we 
can define a digraph whose vertices are the $n$-tuples in $K^n$, and that has an arc 
from one $n$-tuple $i$ to another $n$-tuple $j$ if $j$ can be obtained from $i$ by dropping 
the first entry and adding a new entry at the end. Then, a de Bruijn sequence 
(of order $n$ on $K$) is the same as a Hamiltonian cycle of this digraph. 

Unfortunately, we don’t have any useful criteria that would show that such a 
cycle exists. So this idea seems to be a dead end. 

However, let us do something counterintuitive: We try to reinterpret de Bruijn sequences in terms of Eulerian circuits (rather than Hamiltonian cycles), since we have a good criterion for the existence of Eulerian circuits (unlike for that of Hamiltonian cycles)!

We need a different digraph for that. Namely, we let $D$ be the multidigraph $(K^{n-1}, K^n, \psi)$, where the map $\psi : K^n \to K^{n-1} \times K^{n-1}$ is given by the formula

    $\psi (a_1, a_2, \ldots, a_n) = ((a_1, a_2, \ldots, a_{n-1}), (a_2, a_3, \ldots, a_n))$.

Thus, the vertices of $D$ are the $(n-1)$-tuples (not the $n$-tuples!) of elements of $K$, whereas the arcs are the $n$-tuples of elements of $K$, and each such arc $(a_1, a_2, \ldots, a_n)$ has source $(a_1, a_2, \ldots, a_{n-1})$ and target $(a_2, a_3, \ldots, a_n)$. Hence, there is an arc from each $(n-1)$-tuple $i \in K^{n-1}$ to each $(n-1)$-tuple $j \in K^{n-1}$ that is obtained by dropping the first entry of $i$ and adding a new entry at the end. (Be careful: If $n = 1$, then $D$ has only one vertex (the empty tuple) but $k$ arcs, all of which are loops. If this confuses you, just do the $n = 1$ case by hand. For any $n > 1$, there are no parallel arcs in $D$.)


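Example 5.16.3. For example, if $n = 3$ and $k = 2$ and $K = \{0, 1\}$, then $D$ looks as follows (we again write our tuples without commas and without parentheses):

[Figure: the multidigraph D with vertices 00, 01, 10, 11 and with arcs 000 : 00 → 00, 001 : 00 → 01, 010 : 01 → 10, 011 : 01 → 11, 100 : 10 → 00, 101 : 10 → 01, 110 : 11 → 10, 111 : 11 → 11.]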


Let us make a few observations about D: 


e The multidigraph D is strongly connected. 


[Proof: We need to show that for any two vertices $i$ and $j$ of $D$, there is a walk from $i$ to $j$. But this is easy: Just insert the entries of $j$ into $i$ one by one, pushing out the entries of $i$. In other words, using the notation $k_p$ for the $p$-th entry of any tuple $k$, we have the walk

    $i = (i_1, i_2, \ldots, i_{n-1})$
    $\to (i_2, i_3, \ldots, i_{n-1}, j_1)$
    $\to (i_3, i_4, \ldots, i_{n-1}, j_1, j_2)$
    $\to \cdots$
    $\to (i_{n-1}, j_1, j_2, \ldots, j_{n-2})$
    $\to (j_1, j_2, \ldots, j_{n-1}) = j$.

Note that this walk has length $n - 1$, and is the unique walk from $i$ to $j$ that has length $n - 1$. Thus, the # of walks from $i$ to $j$ that have length $n - 1$ is 1. This will come in useful further below.]


e Thus, the multidigraph D is weakly connected (since any strongly con- 
nected digraph is weakly connected). 


e The multidigraph D is balanced, and in fact each vertex of D has outde- 
gree k and indegree k. 


[Proof: Let $i$ be a vertex of $D$. The arcs with source $i$ are the $n$-tuples whose first $n - 1$ entries form the $(n-1)$-tuple $i$ while the last, $n$-th entry is an arbitrary element of $K$. Thus, there are $|K|$ many such arcs. In other words, $i$ has outdegree $k$. A similar argument shows that $i$ has indegree $k$. This entails that $\deg^+ i = \deg^- i$. Since this holds for every vertex $i$, we conclude that $D$ is balanced.]


e The digraph D has an Eulerian circuit. 


[Proof: This follows from the directed Euler-Hierholzer theorem (Theorem 4.7.2), since $D$ is weakly connected and balanced. Alternatively, we can derive this from the BEST theorem (Theorem 5.9.1) as follows: Pick an arbitrary arc $a$ of $D$, and let $r$ be its source. Then, $r$ is a from-root of $D$ (since $D$ is strongly connected), and thus $D$ has a spanning arborescence rooted from $r$ (by Theorem 5.8.4). In other words, using the notations of the BEST theorem (Theorem 5.9.1), we have $\tau (D, r) \neq 0$. Moreover, each vertex of $D$ has indegree $k > 0$. Thus, the BEST theorem yields

    $\varepsilon (D, a) = \tau (D, r) \cdot \prod_{u \in V} (\deg^- u - 1)! \neq 0$

(since both the factor $\tau (D, r)$ and the product $\prod_{u \in V} (\deg^- u - 1)!$ are $\neq 0$). But this shows that $D$ has an Eulerian circuit whose last arc is $a$.]
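
The observations above can be spot-checked on small cases. The following Python sketch builds D for given n and alphabet and tests the degree and connectivity claims; the helper names (`de_bruijn_digraph`, `degrees`, `reachable_from`) are illustrative and not part of these notes.

```python
from collections import Counter, deque
from itertools import product

def de_bruijn_digraph(n, alphabet):
    """Return (vertices, arcs) of D: the vertices are the (n-1)-tuples, the
    arcs are the n-tuples, and arcs[a] is the pair (source, target) of a."""
    vertices = list(product(alphabet, repeat=n - 1))
    arcs = {a: (a[:-1], a[1:]) for a in product(alphabet, repeat=n)}
    return vertices, arcs

def degrees(arcs):
    """Outdegree and indegree of every vertex, as Counters."""
    outdeg = Counter(src for src, _ in arcs.values())
    indeg = Counter(tgt for _, tgt in arcs.values())
    return outdeg, indeg

def reachable_from(start, arcs):
    """Set of vertices reachable from `start` by a walk (breadth-first search)."""
    successors = {}
    for src, tgt in arcs.values():
        successors.setdefault(src, []).append(tgt)
    seen, queue = {start}, deque([start])
    while queue:
        v = queue.popleft()
        for w in successors.get(v, []):
            if w not in seen:
                seen.add(w)
                queue.append(w)
    return seen

vertices, arcs = de_bruijn_digraph(3, (0, 1))
outdeg, indeg = degrees(arcs)
balanced = all(outdeg[v] == indeg[v] == 2 for v in vertices)
strongly_connected = all(reachable_from(v, arcs) == set(vertices) for v in vertices)
```

For n = 1 the same function returns the single vertex `()` together with k loops, which matches the warning given when D was defined.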


So we know that D has an Eulerian circuit c. This Eulerian circuit leads to a 
de Bruijn sequence as follows: 

Let $p_0, p_1, \ldots, p_{k^n-1}$ be the arcs of $c$ (from first to last). Extend the subscripts periodically modulo $k^n$ (that is, set $p_{q+k^n} = p_q$ for all $q \in \mathbb{N}$). Thus, we obtain an infinite walk with arcs $p_0, p_1, p_2, \ldots$ (since $c$ is a circuit). In other words, for each $i \in \mathbb{N}$, the target of the arc $p_i$ is the source of the arc $p_{i+1}$.

In other words, for each $i \in \mathbb{N}$, the last $n - 1$ entries of $p_i$ are the first $n - 1$ entries of $p_{i+1}$ (since the target of $p_i$ is the tuple consisting of the last $n - 1$ entries of $p_i$, whereas the source of $p_{i+1}$ is the tuple consisting of the first $n - 1$ entries of $p_{i+1}$). Therefore, for each $i \in \mathbb{N}$ and each $j \in \{2, 3, \ldots, n\}$, we have

    (the $j$-th entry of $p_i$) = (the $(j-1)$-st entry of $p_{i+1}$).    (25)

Now, for each $i \in \mathbb{N}$, we let $x_i$ denote the first entry of the $n$-tuple $p_i$. Then, $x_{q+k^n} = x_q$ for all $q \in \mathbb{N}$ (since $p_{q+k^n} = p_q$ for all $q \in \mathbb{N}$). In other words, the sequence $(x_0, x_1, x_2, \ldots)$ repeats itself every $k^n$ terms. Note that the $k^n$-tuple $(x_0, x_1, \ldots, x_{k^n-1})$ consists of the first entries of the arcs $p_0, p_1, \ldots, p_{k^n-1}$ of $c$ (by the definition of $x_i$).

For each $i \in \mathbb{N}$ and each $s \in \{1, 2, \ldots, n\}$, we have

    (the $s$-th entry of $p_i$)
    = (the $(s-1)$-st entry of $p_{i+1}$)    (by (25))
    = (the $(s-2)$-nd entry of $p_{i+2}$)    (by (25))
    = (the $(s-3)$-rd entry of $p_{i+3}$)    (by (25))
    = $\cdots$
    = (the 1-st entry of $p_{i+s-1}$)
    = $x_{i+s-1}$    (since $x_{i+s-1}$ was defined as the first entry of $p_{i+s-1}$).

In other words, for each $i \in \mathbb{N}$, the entries of $p_i$ (from first to last) are $x_i, x_{i+1}, \ldots, x_{i+n-1}$. In other words, for each $i \in \mathbb{N}$, we have

    $p_i = (x_i, x_{i+1}, \ldots, x_{i+n-1})$.    (26)

Now, recall that $c$ is an Eulerian circuit. Thus, each arc of $D$ appears exactly once among its arcs $p_0, p_1, \ldots, p_{k^n-1}$. In other words, each $n$-tuple in $K^n$ appears exactly once among $p_0, p_1, \ldots, p_{k^n-1}$ (since the arcs of $D$ are the $n$-tuples in $K^n$). In other words, as $i$ ranges from $0$ to $k^n - 1$, the $n$-tuple $p_i$ takes each possible value in $K^n$ exactly once.

In view of (26), we can rewrite this as follows: As $i$ ranges from $0$ to $k^n - 1$, the $n$-tuple $(x_i, x_{i+1}, \ldots, x_{i+n-1})$ takes each possible value in $K^n$ exactly once (since this $n$-tuple is precisely $p_i$, as we have shown in the previous paragraph). In other words, for each $(a_1, a_2, \ldots, a_n) \in K^n$, there is a unique $r \in \{0, 1, \ldots, k^n - 1\}$ such that $(a_1, a_2, \ldots, a_n) = (x_r, x_{r+1}, \ldots, x_{r+n-1})$.

Hence, the $k^n$-tuple $(x_0, x_1, \ldots, x_{k^n-1})$ is a de Bruijn sequence of order $n$ on $K$. This shows that a de Bruijn sequence exists. Theorem 5.16.2 is thus proven.


69We have never formally defined infinite walks, but it should be fairly clear what they are. 

Example 5.16.4. For $n = 3$ and $k = 2$ and $K = \{0, 1\}$, one possible Eulerian circuit $c$ of $D$ is

    (00, 000, 00, 001, 01, 010, 10, 101, 01, 011, 11, 111, 11, 110, 10, 100, 00)

(where the 3-digit entries are the arcs and the 2-digit entries are the vertices). The first entries of the eight arcs of this circuit form the sequence 00010111, which is indeed a de Bruijn sequence of order 3 on $\{0, 1\}$. Any 3 consecutive entries of this sequence (extended periodically to the infinite sequence || : 00010111 : ||) form the respective arc of $c$.
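
The proof of Theorem 5.16.2 is constructive once an Eulerian circuit of D is available. The following Python sketch finds such a circuit with the standard stack-based form of Hierholzer's algorithm (this particular algorithm is a choice made here for illustration; the proof above only uses the existence of the circuit) and then reads off the first entries of its arcs. The function name `de_bruijn_sequence` is hypothetical.

```python
from itertools import product

def de_bruijn_sequence(n, alphabet):
    """Return a de Bruijn sequence of order n on `alphabet`, obtained from an
    Eulerian circuit of D by taking the first entry of each arc."""
    alphabet = list(alphabet)
    # unused[v] lists the letters x for which the arc v + (x,) is still unused.
    unused = {v: list(alphabet) for v in product(alphabet, repeat=n - 1)}
    start = (alphabet[0],) * (n - 1)
    # Each stack entry is (vertex, arc along which we arrived at that vertex).
    stack, circuit_arcs = [(start, None)], []
    while stack:
        v, arc_into_v = stack[-1]
        if unused[v]:
            x = unused[v].pop()
            arc = v + (x,)                 # arc with source v
            stack.append((arc[1:], arc))   # move to its target
        else:
            stack.pop()
            if arc_into_v is not None:
                circuit_arcs.append(arc_into_v)
    circuit_arcs.reverse()                 # arcs were collected in reverse order
    return [arc[0] for arc in circuit_arcs]

sequence = de_bruijn_sequence(3, (0, 1))
```

Different orders of trying the unused arcs give different Eulerian circuits and therefore different de Bruijn sequences.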


O 


Theorem [5.16.2] is merely the starting point of a theory. Several specific de 
Bruijn sequences are known, many of them having peculiar properties. See 
for a survey of various such sequences” (note that they are called 
“full length nonlinear shift register sequences” in this survey) 

There are also several variations on de Bruijn sequences. For some of them, 
see [ChDiGr92]. (Note that some of the open questions in that paper are still 
unsolved.) A variation that recently became quite popular is the notion of a 
“universal cycle for permutations” — a string that contains all “permutations” 
(more precisely, n-tuples of distinct elements of K) as factors. See 
for some recent progress on minimizing the length of such a string, including 
a contribution by a notorious hacker known as 4chan. (This is no longer really 
about Eulerian circuits, since some amount of duplication cannot be avoided in 
these strings.) 


5.16.3. Counting de Bruijn sequences 


Let us move in a different direction. Having proved the existence of de Bruijn sequences in Theorem 5.16.2, let us try to count them!

Question. Let $n$ and $k$ be two positive integers. Let $K$ be a $k$-element set. How many de Bruijn sequences of order $n$ on $K$ are there?

To solve this, it makes sense to apply the BEST theorem to the digraph $D$ we have constructed above. Alas, $D$ is not of the form $G^{\mathrm{bidir}}$ for some undirected graph $G$, so we cannot apply the undirected MTT (Matrix-Tree Theorem). However, $D$ is a balanced multidigraph, and for such digraphs, a version of the undirected MTT still holds:


Some of these sequences (the “prefer-one” and “prefer-opposite” generators) are just dis- 
guised implementations of the algorithm for finding an Eulerian circuit implicit in our 
proof of the BEST theorem. 

71My favorite is the one obtained by concatenating all Lyndon words whose length divides 
n in lexicographically increasing order (assuming that the set K is totally ordered). See 


for the details of that construction. 

Theorem 5.16.5 (balanced Matrix-Tree Theorem). Let $D = (V, A, \psi)$ be a balanced multidigraph. Assume that $V = \{1, 2, \ldots, n\}$ for some positive integer $n$.

Let $L$ be the Laplacian of $D$. Then:

(a) For any vertex $r$ of $D$, we have

    (# of spanning arborescences of $D$ rooted to $r$) $= \det (L_{\sim r, \sim r})$.

Moreover, this number does not depend on $r$.

(b) Let $t$ be an indeterminate. Expand the determinant $\det (t I_n + L)$ (here, $I_n$ denotes the $n \times n$ identity matrix) as a polynomial in $t$:

    $\det (t I_n + L) = c_n t^n + c_{n-1} t^{n-1} + \cdots + c_1 t^1 + c_0 t^0$,

where $c_0, c_1, \ldots, c_n$ are numbers. (Note that this is the characteristic polynomial of $L$ up to substituting $-t$ for $t$ and multiplying by a power of $-1$. Some of its coefficients are $c_n = 1$ and $c_{n-1} = \operatorname{Tr} L$ and $c_0 = \det L$.) Then, for any vertex $r$ of $D$, we have

    (# of spanning arborescences of $D$ rooted to $r$) $= \frac{1}{n} c_1$.

(c) Let $\lambda_1, \lambda_2, \ldots, \lambda_n$ be the eigenvalues of $L$, listed in such a way that $\lambda_n = 0$. Then, for any vertex $r$ of $D$, we have

    (# of spanning arborescences of $D$ rooted to $r$) $= \frac{1}{n} \cdot \lambda_1 \lambda_2 \cdots \lambda_{n-1}$.

(d) Let $\lambda_1, \lambda_2, \ldots, \lambda_n$ be the eigenvalues of $L$, listed in such a way that $\lambda_n = 0$. If all vertices of $D$ have outdegree $> 0$, then

    (# of Eulerian circuits of $D$) $= |A| \cdot \frac{1}{n} \cdot \lambda_1 \lambda_2 \cdots \lambda_{n-1} \cdot \prod_{u \in V} (\deg^+ u - 1)!$.

(If you identify an Eulerian circuit with its cyclic rotations, then you should drop the $|A|$ factor on the right hand side.)
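
Part (a) can be illustrated on the digraph D from the proof of Theorem 5.16.2. The Python sketch below is only a spot check under stated assumptions: it uses the convention that the Laplacian has (i, j)-th entry (outdegree of i)·[i = j] minus the number of arcs from i to j, and the helper names `det`, `minor` and `de_bruijn_laplacian` are illustrative. It computes det(L_{~r,~r}) for every vertex r so that the values can be compared.

```python
from itertools import product

def det(m):
    """Determinant by Laplace expansion along the first row.
    Exact for integer entries; only meant for small matrices."""
    if not m:
        return 1
    return sum(
        (-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
        for j in range(len(m))
    )

def minor(m, r, s):
    """Remove row r and column s (0-based indices) from the matrix m."""
    return [row[:s] + row[s + 1:] for i, row in enumerate(m) if i != r]

def de_bruijn_laplacian(n, k):
    """Laplacian of D for the alphabet {0, 1, ..., k-1}."""
    vertices = list(product(range(k), repeat=n - 1))
    index = {v: i for i, v in enumerate(vertices)}
    L = [[0] * len(vertices) for _ in vertices]
    for arc in product(range(k), repeat=n):
        src, tgt = index[arc[:-1]], index[arc[1:]]
        L[src][src] += 1   # contributes to the outdegree on the diagonal
        L[src][tgt] -= 1   # subtracts the adjacency matrix (loops cancel)
    return L

L = de_bruijn_laplacian(3, 2)
arborescence_counts = [det(minor(L, r, r)) for r in range(len(L))]
```

All entries of `arborescence_counts` coincide, as part (a) predicts for a balanced multidigraph.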


Proof. (a) The equality comes from the MTT (Theorem 5.14.7). It remains to prove that the # of spanning arborescences of $D$ rooted to $r$ does not depend on $r$. But this is Corollary 5.12.1.

(b) follows from (a) as in the undirected graph case (proof of Theorem 5.15.1 (b)).^72

(c) follows from (b) as in the undirected graph case (proof of Theorem 5.15.1 (c)).

(d) Assume that all vertices of $D$ have outdegree $> 0$. Then,

    (# of Eulerian circuits of $D$) $= \sum_{a \in A}$ (# of Eulerian circuits of $D$ whose first arc is $a$).

However, if $a \in A$ is any arc, and if $r$ is the source of $a$, then

    (# of Eulerian circuits of $D$ whose first arc is $a$)
    $=$ (# of spanning arborescences of $D$ rooted to $r$) $\cdot \prod_{u \in V} (\deg^+ u - 1)!$    (by the BEST' theorem (Theorem 5.10.4))
    $= \frac{1}{n} \cdot \lambda_1 \lambda_2 \cdots \lambda_{n-1} \cdot \prod_{u \in V} (\deg^+ u - 1)!$    (by part (c)).

Hence,

    (# of Eulerian circuits of $D$)
    $= \sum_{a \in A}$ (# of Eulerian circuits of $D$ whose first arc is $a$)
    $= \sum_{a \in A} \frac{1}{n} \cdot \lambda_1 \lambda_2 \cdots \lambda_{n-1} \cdot \prod_{u \in V} (\deg^+ u - 1)!$
    $= |A| \cdot \frac{1}{n} \cdot \lambda_1 \lambda_2 \cdots \lambda_{n-1} \cdot \prod_{u \in V} (\deg^+ u - 1)!$.

This proves part (d). □

72 In more detail: Just as we proved in our above proof of Theorem 5.15.1 (for the undirected case), we have $c_1 = \sum_{r=1}^{n} \det (L_{\sim r, \sim r})$. However, part (a) shows that the number $\det (L_{\sim r, \sim r})$ does not depend on $r$. Thus, the sum $\sum_{r=1}^{n} \det (L_{\sim r, \sim r})$ consists of $n$ equal addends, each of which can be written as $\det (L_{\sim r, \sim r})$ for any vertex $r$ of $D$. Therefore, this sum can be rewritten as $n \cdot \det (L_{\sim r, \sim r})$ for any vertex $r$ of $D$. Hence, the equality $c_1 = \sum_{r=1}^{n} \det (L_{\sim r, \sim r})$ can be rewritten as $c_1 = n \cdot \det (L_{\sim r, \sim r})$ for any vertex $r$ of $D$. Therefore, $\det (L_{\sim r, \sim r}) = \frac{1}{n} c_1$ for any vertex $r$ of $D$. Since part (a) yields

    (# of spanning arborescences of $D$ rooted to $r$) $= \det (L_{\sim r, \sim r})$,

we can rewrite this equality as

    (# of spanning arborescences of $D$ rooted to $r$) $= \frac{1}{n} c_1$.

Now, let’s try to solve our question, i.e., let’s count the de Bruijn sequences of order $n$ on $K$.

Recall the digraph $D$ from our above proof of Theorem 5.16.2. We constructed a de Bruijn sequence of order $n$ on $K$ by finding an Eulerian circuit of $D$. This actually works both ways: The map

    {Eulerian circuits of $D$} $\to$ {de Bruijn sequences of order $n$ on $K$},
    $c \mapsto$ (the sequence of first entries of the arcs of $c$)

is a bijection (make sure you understand why!). Hence, by the bijection principle, we have

    (# of de Bruijn sequences of order $n$ on $K$)
    = (# of Eulerian circuits of $D$).    (27)


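By Theorem 5.16.5 (d), however, we have

    (# of Eulerian circuits of $D$)
    $= |K^n| \cdot \frac{1}{k^{n-1}} \cdot \lambda_1 \lambda_2 \cdots \lambda_{k^{n-1}-1} \cdot \prod_{u \in K^{n-1}} (\deg^+ u - 1)!$,    (28)

where $\lambda_1, \lambda_2, \ldots, \lambda_{k^{n-1}}$ are the eigenvalues of the Laplacian $L$ of $D$, indexed in such a way that $\lambda_{k^{n-1}} = 0$. (Note that the digraph $D = (K^{n-1}, K^n, \psi)$ has $k^{n-1}$ vertices, not $n$ vertices, so the “$n$” in Theorem 5.16.5 is $k^{n-1}$ here.)

As we know, each vertex of $D$ has outdegree $k$. That is, we have $\deg^+ u = k$ for each $u \in K^{n-1}$. Thus,

    $\prod_{u \in K^{n-1}} (\deg^+ u - 1)! = \prod_{u \in K^{n-1}} (k - 1)! = ((k-1)!)^{k^{n-1}}$.

Also,

    $|K^n| \cdot \frac{1}{k^{n-1}} = k^n \cdot \frac{1}{k^{n-1}} = k$.

It remains to find $\lambda_1 \lambda_2 \cdots \lambda_{k^{n-1}-1}$. What are the eigenvalues of $L$?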

The Laplacian $L$ of our digraph $D$ is a $k^{n-1} \times k^{n-1}$-matrix whose rows and columns are indexed by $(n-1)$-tuples in $K^{n-1}$. Strictly speaking, we should relabel the vertices of $D$ as $1, 2, \ldots, k^{n-1}$ here, in order to have a “proper matrix” with a well-defined order on its rows and columns. But let’s not do this; instead, I trust you can do the relabeling yourself, or just use the more general notion of matrices that allows for the rows and the columns to be indexed by arbitrary things (see https://mathoverflow.net/questions/317105 for details).

Let $C$ be the adjacency matrix of the digraph $D$; this is the $k^{n-1} \times k^{n-1}$-matrix (again with rows and columns indexed by $(n-1)$-tuples in $K^{n-1}$) whose $(i, j)$-th entry is the # of arcs with source $i$ and target $j$. In particular, the trace of $C$ is thus the # of loops of $D$. It is easy to see that the loops of $D$ are precisely the arcs of the form $(x, x, \ldots, x) \in K^n$ for $x \in K$; thus, $D$ has exactly $k$ loops. Hence, the trace of $C$ is $k$.
Recall the definition of the Laplacian matrix $L$. We can restate it as follows:

    $L = \Delta - C$,    (29)

where $\Delta$ is the diagonal matrix whose diagonal entries are the outdegrees of the vertices of $D$. Since each vertex of $D$ has outdegree $k$, the latter diagonal matrix $\Delta$ is simply $k \cdot I$, where $I$ is the identity matrix (of the appropriate size). Hence, (29) can be rewritten as

    $L = k \cdot I - C$.

Thus, if $\gamma_1, \gamma_2, \ldots, \gamma_{k^{n-1}}$ are the eigenvalues of $C$, then $k - \gamma_1, k - \gamma_2, \ldots, k - \gamma_{k^{n-1}}$ are the eigenvalues of $L$. Computing the former will thus help us find the latter.

Furthermore, let $J$ be the $k^{n-1} \times k^{n-1}$-matrix (again with rows and columns indexed by $(n-1)$-tuples in $K^{n-1}$) all of whose entries are 1. It is easy to see that the eigenvalues of $J$ are

    $0, 0, \ldots, 0$ (these are $k^{n-1} - 1$ many zeroes) and $k^{n-1}$.

(The easiest way to see this is by noticing that $J$ has rank 1 and trace $k^{n-1}$.^73)

Now, here is something really underhanded: We observe that

    $C^{n-1} = J$.

[Proof: We need to show that all entries of the matrix $C^{n-1}$ are 1. So let $i$ and $j$ be two vertices of $D$. We must then show that the $(i, j)$-th entry of $C^{n-1}$ is 1.

Recall the combinatorial interpretation of the powers of an adjacency matrix (Theorem 4.5.10): For any $\ell \in \mathbb{N}$, the $(i, j)$-th entry of $C^\ell$ is the # of walks from $i$ to $j$ (in $D$) that have length $\ell$. Thus, in particular, the $(i, j)$-th entry of $C^{n-1}$ is the # of walks from $i$ to $j$ (in $D$) that have length $n - 1$. But this number is actually 1, as we have already shown in our above proof of Theorem 5.16.2. This completes the proof of $C^{n-1} = J$.]
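
The identity C^{n-1} = J, as well as the trace of C, can be confirmed numerically for small n and k. The sketch below uses plain Python lists as matrices; the helper names `adjacency_matrix`, `mat_mul` and `mat_pow` are illustrative choices.

```python
from itertools import product

def adjacency_matrix(n, k):
    """Adjacency matrix C of D for the alphabet {0, ..., k-1}; rows and
    columns are indexed by the (n-1)-tuples in lexicographic order."""
    vertices = list(product(range(k), repeat=n - 1))
    index = {v: i for i, v in enumerate(vertices)}
    C = [[0] * len(vertices) for _ in vertices]
    for arc in product(range(k), repeat=n):
        C[index[arc[:-1]]][index[arc[1:]]] += 1
    return C

def mat_mul(X, Y):
    """Product of two square integer matrices of the same size."""
    size = len(X)
    return [[sum(X[i][t] * Y[t][j] for t in range(size)) for j in range(size)]
            for i in range(size)]

def mat_pow(X, exponent):
    """X raised to a nonnegative integer power by repeated multiplication."""
    result = [[int(i == j) for j in range(len(X))] for i in range(len(X))]
    for _ in range(exponent):
        result = mat_mul(result, X)
    return result

C = adjacency_matrix(3, 2)
power = mat_pow(C, 2)                          # C^{n-1} for n = 3
all_ones = all(entry == 1 for row in power for entry in row)
trace_of_C = sum(C[i][i] for i in range(len(C)))
```

Here `all_ones` records whether C^{n-1} equals J, and `trace_of_C` counts the loops of D.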


How does this help us compute the eigenvalues of $C$? Well, let $\gamma_1, \gamma_2, \ldots, \gamma_{k^{n-1}}$ be the eigenvalues of $C$. Then, for any $\ell \in \mathbb{N}$, the eigenvalues of $C^\ell$ are $\gamma_1^\ell, \gamma_2^\ell, \ldots, \gamma_{k^{n-1}}^\ell$ (this is a fact that holds for any square matrix, and is probably easiest to prove using the Jordan canonical form or triangularization). Hence, in particular, $\gamma_1^{n-1}, \gamma_2^{n-1}, \ldots, \gamma_{k^{n-1}}^{n-1}$ are the eigenvalues of $C^{n-1} = J$; but we know that the latter eigenvalues are $0, 0, \ldots, 0$ (these are $k^{n-1} - 1$ many zeroes) and $k^{n-1}$. Hence, all but one of the $k^{n-1}$ numbers $\gamma_1^{n-1}, \gamma_2^{n-1}, \ldots, \gamma_{k^{n-1}}^{n-1}$ equal 0. Thus, all but one of the $k^{n-1}$ numbers $\gamma_1, \gamma_2, \ldots, \gamma_{k^{n-1}}$ equal 0 (we don’t know what the remaining number is, since $(n-1)$-st roots are not uniquely determined in $\mathbb{C}$). In other words, all but one of the eigenvalues of $C$ equal 0. The remaining eigenvalue must thus be the trace of $C$ (because the sum of the eigenvalues of a square matrix is known to be the trace of that matrix), and therefore equal $k$ (since we know that the trace of $C$ is $k$).

73 Here are the details: The matrix $J$ has rank 1 (since all its rows are the same); thus, all but one of its eigenvalues are 0. It remains to show that the remaining eigenvalue is $k^{n-1}$. However, it is known that the sum of the eigenvalues of a square matrix equals its trace. Thus, if all but one of the eigenvalues of a square matrix are 0, then the remaining eigenvalue equals its trace. Applying this to our matrix $J$, we see that its remaining eigenvalue equals its trace, which is $k^{n-1}$.

So we have shown that the eigenvalues of $C$ are $0, 0, \ldots, 0$ (these are $k^{n-1} - 1$ many zeroes) and $k$. Thus, the eigenvalues of $L$ are

    $k - 0, k - 0, \ldots, k - 0$ (these are $k^{n-1} - 1$ many $(k - 0)$’s) and $k - k$

(because if $\gamma_1, \gamma_2, \ldots, \gamma_{k^{n-1}}$ are the eigenvalues of $C$, then $k - \gamma_1, k - \gamma_2, \ldots, k - \gamma_{k^{n-1}}$ are the eigenvalues of $L$). In other words, the eigenvalues of $L$ are

    $k, k, \ldots, k$ (these are $k^{n-1} - 1$ many $k$’s) and $0$.
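Hence, the eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_{k^{n-1}-1}$ in (28) all equal $k$. Thus, (28) simplifies to

    (# of Eulerian circuits of $D$)
    $= |K^n| \cdot \frac{1}{k^{n-1}} \cdot \underbrace{k k \cdots k}_{k^{n-1}-1 \text{ factors}} \cdot \prod_{u \in K^{n-1}} (\deg^+ u - 1)!$
    $= k \cdot k^{k^{n-1}-1} \cdot ((k-1)!)^{k^{n-1}}$
        (since $|K^n| \cdot \frac{1}{k^{n-1}} = k$ and $\prod_{u \in K^{n-1}} (\deg^+ u - 1)! = ((k-1)!)^{k^{n-1}}$)
    $= k^{k^{n-1}} \cdot ((k-1)!)^{k^{n-1}}$
    $= (k \cdot (k-1)!)^{k^{n-1}} = k!^{k^{n-1}}$.

In view of this, we can rewrite (27) as

    (# of de Bruijn sequences of order $n$ on $K$) $= k!^{k^{n-1}}$.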

Thus, we have proved the following: 

Theorem 5.16.6. Let $n$ and $k$ be positive integers. Let $K$ be a $k$-element set. Then,

    (# of de Bruijn sequences of order $n$ on $K$) $= k!^{k^{n-1}}$.


What a nice (and huge) answer! 
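
For very small n and k, the count k!^{k^{n-1}} can be compared against a brute-force enumeration. The sketch below simply runs through all k^n-tuples and keeps those whose cyclic windows are pairwise distinct; it is exponential in k^n and only meant as a sanity check, and the function name `count_de_bruijn` is illustrative.

```python
from itertools import product
from math import factorial

def count_de_bruijn(n, k):
    """Count de Bruijn sequences of order n on {0, ..., k-1} by brute force.
    Note that cyclic rotations of a sequence are counted separately, just as
    in the definition of a de Bruijn sequence as a tuple."""
    length = k ** n
    count = 0
    for c in product(range(k), repeat=length):
        windows = {tuple(c[(r + t) % length] for t in range(n))
                   for r in range(length)}
        # k^n pairwise distinct windows means every n-tuple occurs exactly once.
        if len(windows) == length:
            count += 1
    return count

small_cases = [(1, 3), (2, 2), (3, 2), (2, 3)]
brute_force = {case: count_de_bruijn(*case) for case in small_cases}
formula = {(n, k): factorial(k) ** (k ** (n - 1)) for (n, k) in small_cases}
```

The two dictionaries agree on these cases, in line with the theorem.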

Our above proof of Theorem 5.16.6 is essentially taken from [Stanle18, Chapter 10].

We note that a combinatorial proof of Theorem 5.16.6 (avoiding any use of linear algebra) has been recently given in [BidKis02].


5.17. More on Laplacians 


Much more can be said about the Laplacian of a digraph. The study of matrices 
associated to a graph or digraph is known as spectral graph theory; I’d say the 
Laplacian is probably the most prominent of these matrices (even though the 
adjacency matrix is somewhat easier to define). The original form of the matrix- 
tree theorem (actually a subtler variant of Theorem (a)) was found by 
Gustav Kirchhoff in his study of electricity (see §2.1.1] for 
a modern exposition); the effective resistance between two nodes of an electrical 
network is a ratio of spanning-tree counts and thus can be computed using the 
Laplacian (see, e.g., §2 and §3]). To be more precise, this relies on a 
“weighted count” of spanning trees, which is more general than the counting 
we have done so far; we will learn about it in the next section. 


Another application of Laplacians is to drawing graphs: see “spectral layout” 
or “spectral graph drawing” (e.g., [Gallie13]). 


5.18. On the left nullspace of the Laplacian 


Let me mention one more result about Laplacians of digraphs that answers a rather natural question you might already have asked yourself. Recall that the Laplacian $L$ of a digraph $D$ always satisfies $Le = 0$, where $e = (1, 1, \ldots, 1)^T$ is the column vector all of whose entries are 1. Thus, the vector $e$ belongs to the right nullspace (= right kernel) of $L$. It is not hard to see that if $D$ has a to-root and we are working over a characteristic-0 field, then $e$ spans this nullspace, i.e., there are no vectors in that nullspace other than scalar multiples of $e$. (This is actually an “if and only if”.) What about the left nullspace of $L$? Can we explicitly find a nonzero vector $f$ with $fL = 0$? The answer is positive:

Theorem 5.18.1 (harmonic vector theorem for Laplacians). Let $D = (V, A, \psi)$ be a multidigraph, where $V = \{1, 2, \ldots, n\}$ for some $n \in \mathbb{N}$.

For each $r \in V$, let $\tau (D, r)$ be the # of spanning arborescences of $D$ rooted to $r$.

Let $f$ be the row vector $(\tau (D, 1), \tau (D, 2), \ldots, \tau (D, n))$. Then, $fL = 0$.
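
To see the theorem in action, one can compute the numbers τ(D, r) by brute force on a small digraph and multiply the resulting row vector by the Laplacian. In the Python sketch below, the digraph (vertices 1, 2, 3 with arcs 1 → 2, 2 → 3, 1 → 3, 3 → 1) is an arbitrary example chosen for illustration, and the helper names are hypothetical. Spanning arborescences rooted to r are enumerated by choosing one outgoing arc for each vertex other than r and keeping the choices along which every vertex reaches r.

```python
from itertools import product

vertices = [1, 2, 3]
arcs = [(1, 2), (2, 3), (1, 3), (3, 1)]   # (source, target) pairs; example data

def count_arborescences_to(root):
    """Number of spanning arborescences rooted to `root`, by brute force."""
    others = [v for v in vertices if v != root]
    out_arcs = {v: [a for a in arcs if a[0] == v] for v in others}
    count = 0
    for choice in product(*(out_arcs[v] for v in others)):
        successor = {src: tgt for src, tgt in choice}
        reaches_root = True
        for v in others:
            seen = set()
            while v != root and v not in seen:   # follow chosen arcs, detect cycles
                seen.add(v)
                v = successor[v]
            reaches_root = reaches_root and v == root
        count += reaches_root
    return count

def laplacian():
    """Laplacian with entries (outdegree of i) * [i = j] - (# of arcs i -> j)."""
    n = len(vertices)
    L = [[0] * n for _ in range(n)]
    for src, tgt in arcs:
        L[src - 1][src - 1] += 1
        L[src - 1][tgt - 1] -= 1
    return L

f = [count_arborescences_to(r) for r in vertices]
L = laplacian()
f_times_L = [sum(f[r] * L[r][s] for r in range(len(L))) for s in range(len(L))]
```

For this example, `f_times_L` is the zero row vector, as the theorem asserts.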


Theorem 5.18.1 (or, more precisely, its weighted version, which we will see in the next section) can be used to explicitly compute the steady state of a Markov chain (see [KrGrWi10]); a similar interpretation, but in economic terms (emergence of money in a barter economy), appears in §1].


We shall give a proof of Theorem 5.18.1 based upon two lemmas. The first lemma is a general linear-algebraic result:


Lemma 5.18.2. Let $B$ be an $n \times n$-matrix over an arbitrary commutative ring $\mathbb{K}$. (For example, $\mathbb{K}$ can be $\mathbb{R}$, in which case $B$ is a real matrix.) Assume that the sum of all columns of $B$ is the zero vector. Then, for any $r, s, t \in \{1, 2, \ldots, n\}$, we have

    $\det (B_{\sim r, \sim t}) = (-1)^{s-t} \det (B_{\sim r, \sim s})$.


Proof of Lemma 5.18.2. There are various ways to prove this, but here is probably the most elegant one:

We WLOG assume that $s \neq t$, since otherwise the claim is obvious. Let us now change the $r$-th row of the matrix $B$ as follows:

e We replace the $s$-th entry of the $r$-th row by $1$.
e We replace the $t$-th entry of the $r$-th row by $-1$.
e We replace all other entries of the $r$-th row by $0$.

Let $C$ be the resulting $n \times n$-matrix.^74 Thus, $C$ agrees with $B$ in all rows other than the $r$-th one. Hence, in particular,

    $C_{\sim r, \sim k} = B_{\sim r, \sim k}$ for each $k \in \{1, 2, \ldots, n\}$.    (30)

74 For example, if $n = 4$ and
    $B = \begin{pmatrix} a & b & c & d \\ a' & b' & c' & d' \\ a'' & b'' & c'' & d'' \\ a''' & b''' & c''' & d''' \end{pmatrix}$
and $s = 1$ and $t = 3$ and $r = 2$, then
    $C = \begin{pmatrix} a & b & c & d \\ 1 & 0 & -1 & 0 \\ a'' & b'' & c'' & d'' \\ a''' & b''' & c''' & d''' \end{pmatrix}$.

Note also that the only nonzero entries in the $r$-th row of $C$ are^75 $C_{r,s} = 1$ and $C_{r,t} = -1$. Hence, the entries in the $r$-th row of $C$ add up to 0.

Recall that the sum of all columns of $B$ is the zero vector. In other words, in each row of $B$, the entries add up to 0. The matrix $C$ therefore also has this property (because the only row of $C$ that differs from the corresponding row of $B$ is the $r$-th row; however, we have shown above that in the $r$-th row, the entries of $C$ also add up to 0). In other words, the sum of all columns of $C$ is the zero vector. This easily entails that $\det C = 0$.^76

On the other hand, Laplace expansion along the $r$-th row yields

    $\det C = \sum_{k=1}^{n} (-1)^{r+k} C_{r,k} \det (C_{\sim r, \sim k})$
    $= (-1)^{r+s} \det (C_{\sim r, \sim s}) - (-1)^{r+t} \det (C_{\sim r, \sim t})$

(since the only nonzero entries $C_{r,k}$ in the $r$-th row of $C$ are $C_{r,s} = 1$ and $C_{r,t} = -1$). Comparing this with $\det C = 0$, we obtain

    $0 = (-1)^{r+s} \det (C_{\sim r, \sim s}) - (-1)^{r+t} \det (C_{\sim r, \sim t})$
    $= (-1)^{r+s} \det (B_{\sim r, \sim s}) - (-1)^{r+t} \det (B_{\sim r, \sim t})$    (by (30)).

In other words, $(-1)^{r+t} \det (B_{\sim r, \sim t}) = (-1)^{r+s} \det (B_{\sim r, \sim s})$. Dividing both sides of this by $(-1)^{r+t}$, we obtain $\det (B_{\sim r, \sim t}) = (-1)^{s-t} \det (B_{\sim r, \sim s})$. This proves Lemma 5.18.2. □


Our next lemma is the following generalization of Theorem 5.14.7:


Theorem 5.18.3 (Matrix-Tree Theorem, off-diagonal version). Let $D = (V, A, \psi)$ be a multidigraph. Assume that $V = \{1, 2, \ldots, n\}$ for some positive integer $n$.

Let $L$ be the Laplacian of $D$. Let $r$ and $s$ be two vertices of $D$. Then,

    (# of spanning arborescences of $D$ rooted to $r$) $= (-1)^{r+s} \det (L_{\sim r, \sim s})$.


75 We are using the notation $C_{r,k}$ for the entry of $C$ in the $r$-th row and the $k$-th column.

76 Proof. It is well-known that the determinant of a matrix does not change if we add a column to another. Hence, the determinant of $C$ will not change if we add each column of $C$ other than the first one to the first column of $C$. However, the result of this operation will be a matrix whose first column is 0 (since the sum of all columns of $C$ is the zero vector), and therefore this matrix will have determinant 0. Since the operation did not change the determinant, we thus conclude that the determinant of $C$ was 0. In other words, $\det C = 0$.

Note that Theorem 5.14.7 is the particular case of Theorem 5.18.3 for $s = r$. Fortunately, using Lemma 5.18.2, we can easily derive the general case from the particular:


Proof of Theorem 5.18.3. We have seen (in the proof of Proposition 5.14.6) that the sum of all columns of the Laplacian $L$ is the zero vector. Hence, Lemma 5.18.2 (applied to $\mathbb{K} = \mathbb{Q}$ and $B = L$ and $t = r$) yields

    $\det (L_{\sim r, \sim r}) = (-1)^{s-r} \det (L_{\sim r, \sim s}) = (-1)^{r+s} \det (L_{\sim r, \sim s})$

(since $(-1)^{s-r} = (-1)^{r+s}$). However, the Matrix-Tree Theorem (Theorem 5.14.7) yields

    (# of spanning arborescences of $D$ rooted to $r$) $= \det (L_{\sim r, \sim r}) = (-1)^{r+s} \det (L_{\sim r, \sim s})$.

This proves Theorem 5.18.3. □

We are now ready to prove Theorem 5.18.1:


Proof of Theorem 5.18.1. For each $r, s \in \{1, 2, \ldots, n\}$, we have

    $\tau (D, r)$ = (# of spanning arborescences of $D$ rooted to $r$)    (by the definition of $\tau (D, r)$)
    $= (-1)^{r+s} \det (L_{\sim r, \sim s})$    (31)

(by Theorem 5.18.3).

However, we have $f = (\tau (D, 1), \tau (D, 2), \ldots, \tau (D, n))$. Thus, for each $s \in \{1, 2, \ldots, n\}$, the $s$-th entry of the row vector $fL$ is^77

    $\sum_{r=1}^{n} \tau (D, r) L_{r,s}$
    $= \sum_{r=1}^{n} (-1)^{r+s} \det (L_{\sim r, \sim s}) L_{r,s}$    (by (31))
    $= \sum_{r=1}^{n} (-1)^{r+s} L_{r,s} \det (L_{\sim r, \sim s})$
    $= \det L$    (since Laplace expansion along the $s$-th column yields $\det L = \sum_{r=1}^{n} (-1)^{r+s} L_{r,s} \det (L_{\sim r, \sim s})$)
    $= 0$

(by Proposition 5.14.6). This shows that all entries of $fL$ are 0. In other words, $fL = 0$. Theorem 5.18.1 is thus proved. □


77 We are using the notation $L_{r,s}$ for the entry of the matrix $L$ in the $r$-th row and the $s$-th column.

Other proofs of Theorem 5.18.1 exist. In particular, a combinatorial proof is sketched in [Sahi14, Theorem 1]. (More precisely, [Sahi14, Theorem 1] is the claim of Theorem 5.18.1 upon reversing all the arcs and replacing all matrices by their transposes.^78)


5.19. A weighted Matrix-Tree Theorem 
5.19.1. Definitions 


We have so far been counting arborescences. A natural generalization of count- 
ing is weighted counting — i.e., you assign a certain number (a “weight”) to 
each arborescence (or whatever object you are interested in), and then you sum 
the weights of all arborescences (instead of merely counting them). This gener- 
alizes counting, because if all weights are 1, then you get the # of arborescences. 

If you pick the weights to be completely random, then the sum won’t usually 
be particularly interesting. However, some choices of weights lead to good 
behavior. Let us see what we get if we assign a weight to each arc of our 
digraph, and then define the weight of an arborescence to be the product of the 
weights of the arcs that appear in this arborescence. 


Definition 5.19.1. Let $D = (V, A, \psi)$ be a multidigraph.

Let $\mathbb{K}$ be a commutative ring. Assume that an element $w_a \in \mathbb{K}$ is assigned to each arc $a \in A$. We call this $w_a$ the weight of the arc $a$. (You can assume that $\mathbb{K} = \mathbb{R}$, so that the weights are just numbers.)

(a) For any two vertices $i, j \in V$, we let $a^w_{i,j}$ be the sum of the weights of all arcs of $D$ that have source $i$ and target $j$.

(b) For any vertex $i \in V$, we define the weighted outdegree $\deg^{+w} i$ of $i$ to be the sum

    $\sum_{a \in A; \text{ the source of } a \text{ is } i} w_a$.

(c) If $B$ is a subdigraph of $D$, then the weight $w (B)$ of $B$ is defined to be the product $\prod_{a \text{ is an arc of } B} w_a$. This is the product of the weights of all arcs of $B$.

(d) Assume that $V = \{1, 2, \ldots, n\}$ for some $n \in \mathbb{N}$. The weighted Laplacian of $D$ (with respect to the weights $w_a$) is defined to be the $n \times n$-matrix $L^w \in \mathbb{K}^{n \times n}$ (note that the “$w$” here is a superscript, not an exponent) whose entries are given by

    $L^w_{i,j} = (\deg^{+w} i) \cdot [i = j] - a^w_{i,j}$ for all $i, j \in V$.


78 I tried to explain this proof in more detail in the solutions to Spring 2018 Math 4707 midterm #3 (see the proof of Theorem 0.7 in those solutions); you be the judge if I succeeded.

These definitions generalize analogous definitions in the “unweighted case”. Indeed, if we take all the arc weights $w_a$ to be 1, then the weighted outdegree $\deg^{+w} i$ of a vertex $i$ becomes its usual outdegree $\deg^+ i$, and the weighted Laplacian $L^w$ becomes the usual Laplacian $L$. The weight $w (B)$ of a subdigraph $B$ simply becomes 1 in this case.


5.19.2. The weighted Matrix-Tree Theorem 
We now can generalize the original MTT (= Matrix-Tree Theorem)^79 as follows:


Theorem 5.19.2 (weighted Matrix-Tree Theorem). Let $D = (V, A, \psi)$ be a multidigraph.

Let $\mathbb{K}$ be a commutative ring. Assume that an element $w_a \in \mathbb{K}$ is assigned to each arc $a \in A$. We call this $w_a$ the weight of the arc $a$.

Assume that $V = \{1, 2, \ldots, n\}$ for some $n \in \mathbb{N}$. Let $L^w$ be the weighted Laplacian of $D$.

Let $r$ be a vertex of $D$. Then,

    $\sum_{B \text{ is a spanning arborescence of } D \text{ rooted to } r} w (B) = \det (L^w_{\sim r, \sim r})$.

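Example 5.19.3. Let $D$ be the following multidigraph:

[Figure: a multidigraph with vertices 1, 2, 3 and arcs $\alpha : 1 \to 2$, $\beta : 2 \to 3$, $\gamma : 1 \to 3$, $\delta : 3 \to 1$]

and let $r = 3$.

Then, $D$ has two spanning arborescences rooted to $r$. One of the two has arcs $\alpha$ and $\beta$ (and thus has weight $w_\alpha w_\beta$); the other has arcs $\gamma$ and $\beta$ (and thus has weight $w_\gamma w_\beta$). Hence,

    $\sum_{B \text{ is a spanning arborescence of } D \text{ rooted to } r} w (B) = w_\alpha w_\beta + w_\gamma w_\beta$.    (32)

The weighted Laplacian $L^w$ is

    $L^w = \begin{pmatrix} w_\alpha + w_\gamma & -w_\alpha & -w_\gamma \\ 0 & w_\beta & -w_\beta \\ -w_\delta & 0 & w_\delta \end{pmatrix}$

(since, for example, $\deg^{+w} 1 = w_\alpha + w_\gamma$ and $a^w_{1,1} = 0$ and $a^w_{1,2} = w_\alpha$). Thus,

    $L^w_{\sim 3, \sim 3} = \begin{pmatrix} w_\alpha + w_\gamma & -w_\alpha \\ 0 & w_\beta \end{pmatrix}$

and therefore

    $\det (L^w_{\sim 3, \sim 3}) = (w_\alpha + w_\gamma) w_\beta = w_\alpha w_\beta + w_\gamma w_\beta$.

The right hand side of this agrees with that of (32). This confirms the weighted MTT for our $D$ and $r$.


79 To remind: The original MTT is Theorem 5.14.7.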
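
The weighted MTT can also be tested with concrete numbers. In the following Python sketch, the digraph has vertices 1, 2, 3 and four arcs named alpha, beta, gamma, delta with sources and targets 1 → 2, 2 → 3, 1 → 3, 3 → 1; both this arc list and the numeric weights 2, 3, 5, 7 are assumptions made here for illustration, as are all function names. The weighted sum over spanning arborescences is computed by brute force and compared with the determinant of the corresponding minor of the weighted Laplacian.

```python
from itertools import product

vertices = [1, 2, 3]
# name: (source, target, weight); illustrative data, see the lead-in
arcs = {"alpha": (1, 2, 2), "beta": (2, 3, 3), "gamma": (1, 3, 5), "delta": (3, 1, 7)}

def weighted_arborescence_sum(root):
    """Sum of w(B) over all spanning arborescences B rooted to `root`."""
    others = [v for v in vertices if v != root]
    out_arcs = {v: [a for a in arcs.values() if a[0] == v] for v in others}
    total = 0
    for choice in product(*(out_arcs[v] for v in others)):
        successor = {src: tgt for src, tgt, _ in choice}
        reaches_root = True
        for v in others:
            seen = set()
            while v != root and v not in seen:
                seen.add(v)
                v = successor[v]
            reaches_root = reaches_root and v == root
        if reaches_root:
            weight = 1
            for _, _, w in choice:
                weight *= w
            total += weight
    return total

def weighted_laplacian_minor(root):
    """The matrix L^w with the row and the column of `root` removed."""
    n = len(vertices)
    L = [[0] * n for _ in range(n)]
    for src, tgt, w in arcs.values():
        L[src - 1][src - 1] += w   # weighted outdegree on the diagonal
        L[src - 1][tgt - 1] -= w   # minus the weighted adjacency entry
    keep = [i for i in range(n) if i != root - 1]
    return [[L[i][j] for j in keep] for i in keep]

def det2(m):
    """Determinant of a 2x2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

lhs = {r: weighted_arborescence_sum(r) for r in vertices}
rhs = {r: det2(weighted_laplacian_minor(r)) for r in vertices}
```

With these weights, `lhs` and `rhs` agree for every root r, not only for r = 3.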


As we already said, the weighted MTT generalizes the original MTT, because 
if we take all $w_a$’s to be 1, we just recover the original MTT. 

However, we can also go backwards: we can derive the weighted MTT from 
the original MTT. Let us do this. 


5.19.3. The polynomial identity trick 


First, we recall a standard result in algebra, known as the principle of perma- 
nence of polynomial identities or as the polynomial identity trick (it also goes 
under several other names). Here is one incarnation of this principle: 


Theorem 5.19.4 (principle of permanence of polynomial identities). Let
$P(x_1, x_2, \ldots, x_m)$ and $Q(x_1, x_2, \ldots, x_m)$ be two polynomials with integer
coefficients in several indeterminates $x_1, x_2, \ldots, x_m$. Assume that the equality

$$P(k_1, k_2, \ldots, k_m) = Q(k_1, k_2, \ldots, k_m) \tag{33}$$

holds for every $m$-tuple $(k_1, k_2, \ldots, k_m) \in \mathbb{N}^m$ of nonnegative integers. Then,
$P(x_1, x_2, \ldots, x_m)$ and $Q(x_1, x_2, \ldots, x_m)$ are identical as polynomials (so that,
in particular, the equality (33) holds not only for every $(k_1, k_2, \ldots, k_m) \in \mathbb{N}^m$,
but also for every $(k_1, k_2, \ldots, k_m) \in \mathbb{C}^m$, and more generally, for every
$(k_1, k_2, \ldots, k_m) \in K^m$ where $K$ is an arbitrary commutative ring).


Theorem 5.19.4 is often summarized as “in order to prove that two polynomials
are equal, it suffices to show that they are equal on all nonnegative integer
points” (where a “nonnegative integer point” means a point, i.e., a tuple of
inputs, all of whose entries are nonnegative integers). Even shorter, one says
that “a polynomial identity (i.e., an equality between two polynomials) needs
only to be checked on nonnegative integers”. For example, if you can prove the
equality

$$(x + y)^4 + (x - y)^4 = 2x^4 + 12x^2y^2 + 2y^4$$

for all nonnegative integers $x$ and $y$, then you automatically conclude that this
equality holds as a polynomial identity, and thus is true for any elements $x$ and
$y$ of a commutative ring.

A typical application of Theorem 5.19.4 is to argue that a polynomial identity
you have proved for all nonnegative integers must automatically hold for all
inputs. Some examples of such reasoning can be found in […, §2.6.3 and §2.6.4].
A variant of Theorem 5.19.4 is […, Theorem 2.6]; actually, the proof of
[…, Theorem 2.6] can be trivially adapted to prove Theorem 5.19.4 (just replace
“nonempty open set in $\mathbb{C}^n$” by “$\mathbb{N}^n$”). In truth, there is nothing special
about nonnegative integers and the set $\mathbb{N}$; you could replace $\mathbb{N}$ by any infinite
set of numbers (or even any sufficiently large set of numbers, where
“sufficiently large” means “more than $\max\{\deg P, \deg Q\}$ many”). See
[Alon02, Lemma 2.1] for a fairly general version of Theorem 5.19.4 that
includes such cases.⁸⁰
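
As a small illustration of how such an identity is checked in practice, here is a sketch (not from the original notes) that tests the degree-4 identity above on a 5 × 5 grid of nonnegative integers and then spot-checks it on a few other inputs. The grid size is an illustrative choice (more points per variable than the degree); the function names `lhs` and `rhs` are hypothetical, and the spot checks are only a sanity test, not a proof.

```python
from fractions import Fraction

def lhs(x, y):
    """Left hand side (x + y)^4 + (x - y)^4."""
    return (x + y) ** 4 + (x - y) ** 4

def rhs(x, y):
    """Right hand side 2x^4 + 12x^2y^2 + 2y^4."""
    return 2 * x ** 4 + 12 * x ** 2 * y ** 2 + 2 * y ** 4

# Check the identity on all pairs of nonnegative integers below 5.
grid = range(5)
holds_on_grid = all(lhs(x, y) == rhs(x, y) for x in grid for y in grid)

# Spot checks outside the nonnegative integers (negative and rational inputs).
samples = [(-3, 7), (Fraction(1, 2), Fraction(-5, 3)), (Fraction(-7, 4), -2)]
holds_on_samples = all(lhs(x, y) == rhs(x, y) for x, y in samples)

print(holds_on_grid, holds_on_samples)
```

Exact integer and `Fraction` arithmetic is used so that the comparisons are not affected by rounding.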


5.19.4. Proof of the weighted MTT 
We can now deduce the weighted MTT from the original MTT (Theorem 5.14.7):


Proof of Theorem 5.19.2. The claim of Theorem 5.19.2 (for fixed $D$ and $r$) is an
equality between two polynomials in the arc weights $w_a$. (For instance, in
Example 5.19.3, this equality is
$w_\alpha w_\beta + w_\gamma w_\beta = \det \begin{pmatrix} w_\alpha + w_\gamma & -w_\alpha \\ 0 & w_\beta \end{pmatrix}$.)
Therefore, thanks to Theorem 5.19.4, it suffices to prove this equality in the
case when all arc weights $w_a$ are nonnegative integers. So let us WLOG assume
that all arc weights $w_a$ are nonnegative integers.
Let us now replace each arc $a$ of $D$ by $w_a$ many copies of the arc $a$ (having
the same source as $a$ and the same target as $a$). The result is a new digraph $D'$.
Here is an example:


Example 5.19.5. Let $D$ be the digraph

[Figure: the digraph $D$, with arcs $\alpha$, $\beta$ and $\gamma$]

and let the arc weights be $w_\alpha = 2$ and $w_\beta = 3$ and $w_\gamma = 2$. Then, $D'$ looks as
follows:

[Figure: the digraph $D'$, in which each arc of $D$ has been replaced by its copies]

where $\alpha_1, \alpha_2$ are the two arcs obtained from $\alpha$, and so on.


⁸⁰ To be precise, [Alon02, Lemma 2.1] is not concerned with two polynomials being identical,
but rather with one polynomial being identically zero. But this is an equivalent question:
Two polynomials $P$ and $Q$ are identical if and only if their difference $P - Q$ is identically
zero.


Now, recall that the digraph $D'$ has the same vertices as $D$, but each arc $a$
of $D$ has turned into $w_a$ arcs of $D'$. Thus, the weighted outdegree $\deg^{+w} i$ of
a vertex $i$ of $D$ equals the (usual, i.e., non-weighted) outdegree $\deg^+ i$ of the
same vertex $i$ of $D'$. Hence, the weighted Laplacian $L^w$ of $D$ is the (usual, i.e.,
non-weighted) Laplacian of $D'$.

Recall again that the digraph $D'$ has the same vertices as $D$, but each arc $a$
of $D$ has turned into $w_a$ arcs of $D'$. Thus, each subdigraph $B$ of $D$ gives rise
to $w(B)$ many subdigraphs of $D'$ (because we can replace each arc $a$ of $B$ by
any of the $w_a$ many copies of this arc in $D'$). Moreover, this correspondence
takes spanning arborescences to spanning arborescences,⁸¹ and we can obtain
any spanning arborescence of $D'$ in this way from exactly one $B$. Hence,

$$\sum_{\substack{B \text{ is a spanning} \\ \text{arborescence} \\ \text{of } D \text{ rooted to } r}} w(B) = \left(\text{\# of spanning arborescences of } D' \text{ rooted to } r\right).$$

Thus, applying the original MTT (Theorem 5.14.7) to $D'$ yields the weighted
MTT for $D$ (since the weighted Laplacian $L^w$ of $D$ is the (usual, i.e.,
non-weighted) Laplacian of $D'$). This completes the proof of Theorem 5.19.2.
[Remark: Alternatively, it is not hard to adapt our above proof of the original
MTT to the weighted case.] □
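
The arc-duplication step of this proof can be tried out on a toy digraph. The sketch below is not from the original notes: the endpoints of the arcs ($\alpha : 1 \to 2$, $\beta : 2 \to 3$, $\gamma : 1 \to 3$) are hypothetical, the weights are those of Example 5.19.5, and the helper names are made up for illustration. It builds $D'$ by copying each arc $w_a$ times and compares the weighted sum over $D$ with the plain count over $D'$.

```python
from itertools import product
from math import prod

def reaches(nxt, v, r, n):
    """Follow the chosen arcs from v for at most n steps; report whether r is reached."""
    for _ in range(n):
        if v == r:
            return True
        v = nxt[v]
    return v == r

def spanning_arborescences(n, arcs, r):
    """Yield each spanning arborescence rooted to r as a tuple of arcs (name, source, target)."""
    others = [v for v in range(1, n + 1) if v != r]
    # One outgoing arc per non-root vertex, and every vertex must reach r.
    for choice in product(*([a for a in arcs if a[1] == v] for v in others)):
        nxt = {s: t for _, s, t in choice}
        if all(reaches(nxt, v, r, n) for v in others):
            yield choice

weights = {"alpha": 2, "beta": 3, "gamma": 2}
D = [("alpha", 1, 2), ("beta", 2, 3), ("gamma", 1, 3)]  # hypothetical endpoints

# D': replace each arc a by w_a copies with the same source and target.
D_prime = [(f"{name}_{k}", s, t)
           for name, s, t in D
           for k in range(1, weights[name] + 1)]

weighted_sum = sum(prod(weights[name] for name, _, _ in B)
                   for B in spanning_arborescences(3, D, 3))
plain_count = sum(1 for _ in spanning_arborescences(3, D_prime, 3))
print(weighted_sum, plain_count)
```

Both numbers agree, as the correspondence between subdigraphs of $D$ and subdigraphs of $D'$ predicts.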


5.19.5. Application: Counting trees by their degrees 
The weighted MTT has some applications that wouldn’t be obvious from the
original MTT. Here is one:


⁸¹ More precisely: Let $B$ be a subdigraph of $D$, and let $B'$ be any of the $w(B)$ many
subdigraphs of $D'$ that are obtained from $B$ through this correspondence. Then, $B$ is a spanning
arborescence of $D$ rooted to $r$ if and only if $B'$ is a spanning arborescence of $D'$ rooted to $r$.

Exercise 5.27. Let $n \geq 2$ be an integer, and let $d_1, d_2, \ldots, d_n$ be $n$ positive
integers. An n-tree shall mean a simple graph with vertex set $\{1, 2, \ldots, n\}$
that is a tree. We know from an earlier corollary (Cayley’s formula) that there
are $n^{n-2}$ many n-trees. How many of these n-trees have the property that

$$\deg i = d_i \quad \text{for each vertex } i \text{ ?}$$


Solution. The n-trees are just the spanning trees of the complete graph $K_n$.

To incorporate the $\deg i = d_i$ condition into our count, we use a generating
function. So let us not fix the numbers $d_1, d_2, \ldots, d_n$, but rather consider the
polynomial

$$P(x_1, x_2, \ldots, x_n) := \sum_{T \text{ is an } n\text{-tree}} x_1^{\deg 1} x_2^{\deg 2} \cdots x_n^{\deg n} \tag{34}$$

in $n$ indeterminates $x_1, x_2, \ldots, x_n$ (where $\deg i$ means the degree of $i$ in $T$). Then,
the $x_1^{d_1} x_2^{d_2} \cdots x_n^{d_n}$-coefficient of this polynomial $P(x_1, x_2, \ldots, x_n)$ is the # of
n-trees $T$ satisfying the property that

$$\deg i = d_i \quad \text{for each vertex } i$$

(because each such n-tree $T$ contributes a monomial $x_1^{d_1} x_2^{d_2} \cdots x_n^{d_n}$ to the sum on
the right hand side of (34), whereas any other n-tree $T$ contributes a different
monomial to this sum).

Let us assign to each edge $ij$ of $K_n$ the weight $w_{ij} := x_i x_j$. Then, the definition
of $P(x_1, x_2, \ldots, x_n)$ rewrites as follows:

$$P(x_1, x_2, \ldots, x_n) = \sum_{T \text{ is an } n\text{-tree}} w(T),$$

where $w(T)$ denotes the product of the weights of all edges of $T$. (Indeed, for
any subgraph $T$ of $K_n$, the weight $w(T)$ equals $x_1^{\deg 1} x_2^{\deg 2} \cdots x_n^{\deg n}$, where $\deg i$
means the degree of $i$ in $T$.)

We have assigned weights to the edges of the graph $K_n$; let us now assign the
same weights to the arcs of the digraph $K_n^{\mathrm{bidir}}$. That is, the two arcs $(ij, 1)$ and
$(ij, 2)$ corresponding to an edge $ij$ of $K_n$ shall both have the weight

$$w_{(ij,1)} = w_{(ij,2)} = w_{ij} = x_i x_j. \tag{35}$$

As we are already used to, we can replace spanning trees of $K_n$ by spanning
arborescences of $K_n^{\mathrm{bidir}}$ rooted to 1, since the former are in bijection with the
latter. Thus, we have

$$\left(\text{\# of spanning trees of } K_n\right) = \left(\text{\# of spanning arborescences of } K_n^{\mathrm{bidir}} \text{ rooted to } 1\right).$$


An introduction to graph theory, version August 2, 2023 page 270 


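Moreover, since this bijection preserves weights (because of (35)), we also have

$$\sum_{\substack{T \text{ is a spanning} \\ \text{tree of } K_n}} w(T) = \sum_{\substack{B \text{ is a spanning} \\ \text{arborescence of } K_n^{\mathrm{bidir}} \\ \text{rooted to } 1}} w(B).$$

In other words,

$$\sum_{T \text{ is an } n\text{-tree}} w(T) = \sum_{\substack{B \text{ is a spanning} \\ \text{arborescence of } K_n^{\mathrm{bidir}} \\ \text{rooted to } 1}} w(B)$$

(since the spanning trees of $K_n$ are precisely the n-trees).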

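To compute the right hand side, we shall use the weighted Matrix-Tree Theorem.
The weighted Laplacian of $K_n^{\mathrm{bidir}}$ (with the weights we have just defined)
is the $n \times n$-matrix $L^w$ with entries given by

$$\begin{aligned}
L^w_{i,j} &= \left(\deg^{+w} i\right) \cdot [i = j] - a^w_{i,j} \\
&= \begin{cases} \deg^{+w} i - a^w_{i,j}, & \text{if } i = j; \\ -a^w_{i,j}, & \text{if } i \neq j \end{cases} \\
&= \begin{cases} \deg^{+w} i, & \text{if } i = j; \\ -a^w_{i,j}, & \text{if } i \neq j \end{cases} \\
&= \begin{cases} x_i (x_1 + x_2 + \cdots + x_n) - x_i x_j, & \text{if } i = j; \\ -x_i x_j, & \text{if } i \neq j \end{cases} \\
&= [i = j] \, x_i (x_1 + x_2 + \cdots + x_n) - x_i x_j.
\end{aligned}$$

Here, the third equality sign holds since $a^w_{i,j} = 0$ when $i = j$ (because $K_n^{\mathrm{bidir}}$ has
no loops). The fourth equality sign holds since

$$\begin{aligned}
\deg^{+w} i &= x_i x_1 + x_i x_2 + \cdots + x_i x_{i-1} + x_i x_{i+1} + \cdots + x_i x_n \\
&= x_i (x_1 + x_2 + \cdots + x_{i-1} + x_{i+1} + \cdots + x_n) \\
&= x_i (x_1 + x_2 + \cdots + x_n) - x_i x_i \\
&= x_i (x_1 + x_2 + \cdots + x_n) - x_i x_j \qquad \text{whenever } i = j,
\end{aligned}$$

and since $a^w_{i,j} = x_i x_j$ whenever $i \neq j$.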


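We can find its minor $\det\left(L^w_{\sim 1, \sim 1}\right)$ without too much trouble (e.g., using row
transformations similar to the ones we have done back in the proof of Cayley’s
formula⁸²); the result is

$$\det\left(L^w_{\sim 1, \sim 1}\right) = x_1 x_2 \cdots x_n (x_1 + x_2 + \cdots + x_n)^{n-2}.$$


⁸² The first step, of course, is to factor an $x_i$ out of the $i$-th row for each $i$.

Summarizing what we have done so far,

$$\begin{aligned}
P(x_1, x_2, \ldots, x_n) &= \sum_{T \text{ is an } n\text{-tree}} w(T) = \sum_{\substack{B \text{ is a spanning} \\ \text{arborescence of } K_n^{\mathrm{bidir}} \\ \text{rooted to } 1}} w(B) \\
&= \det\left(L^w_{\sim 1, \sim 1}\right) \qquad \text{(by the weighted Matrix-Tree Theorem)} \\
&= x_1 x_2 \cdots x_n (x_1 + x_2 + \cdots + x_n)^{n-2}.
\end{aligned} \tag{36}$$

As we recall, we are looking for the $x_1^{d_1} x_2^{d_2} \cdots x_n^{d_n}$-coefficient in this polynomial.
From (36), we see that

$$\begin{aligned}
&\left(\text{the } x_1^{d_1} x_2^{d_2} \cdots x_n^{d_n}\text{-coefficient of } P(x_1, x_2, \ldots, x_n)\right) \\
&= \left(\text{the } x_1^{d_1} x_2^{d_2} \cdots x_n^{d_n}\text{-coefficient of } x_1 x_2 \cdots x_n (x_1 + x_2 + \cdots + x_n)^{n-2}\right) \\
&= \left(\text{the } x_1^{d_1 - 1} x_2^{d_2 - 1} \cdots x_n^{d_n - 1}\text{-coefficient of } (x_1 + x_2 + \cdots + x_n)^{n-2}\right)
\end{aligned}$$

(because when we multiply a polynomial by $x_1 x_2 \cdots x_n$, all the exponents in it
get incremented by 1, so its coefficients just shift by a 1 in each exponent).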
Now, how can we describe the coefficients of $(x_1 + x_2 + \cdots + x_n)^{n-2}$, or,
more generally, of $(x_1 + x_2 + \cdots + x_n)^m$ for some $m \in \mathbb{N}$? These are the
so-called multinomial coefficients (named in analogy to the binomial
coefficients, which are their particular case for $n = 2$). Their definition is as follows:
If $p_1, p_2, \ldots, p_n, q$ are nonnegative integers with $q = p_1 + p_2 + \cdots + p_n$, then
the multinomial coefficient $\binom{q}{p_1, p_2, \ldots, p_n}$ is defined to be $\dfrac{q!}{p_1! \, p_2! \cdots p_n!}$. If
$q \neq p_1 + p_2 + \cdots + p_n$, then it is defined to be 0 instead. In either case, this
coefficient is easily seen to be an integer.⁸³ The multinomial formula (aka
multinomial theorem) says that for each $k \in \mathbb{N}$, we have

$$(x_1 + x_2 + \cdots + x_n)^k = \sum_{\substack{i_1, i_2, \ldots, i_n \in \mathbb{N}; \\ i_1 + i_2 + \cdots + i_n = k}} \binom{k}{i_1, i_2, \ldots, i_n} x_1^{i_1} x_2^{i_2} \cdots x_n^{i_n} = \sum_{i_1, i_2, \ldots, i_n \in \mathbb{N}} \binom{k}{i_1, i_2, \ldots, i_n} x_1^{i_1} x_2^{i_2} \cdots x_n^{i_n}$$

(it does not matter whether we restrict the sum by the condition $i_1 + i_2 + \cdots + i_n = k$
or not, since the coefficient $\binom{k}{i_1, i_2, \ldots, i_n}$ is defined to be 0 when this
condition is violated anyway). Hence,

$$\left(\text{the } x_1^{i_1} x_2^{i_2} \cdots x_n^{i_n}\text{-coefficient of } (x_1 + x_2 + \cdots + x_n)^k\right) = \binom{k}{i_1, i_2, \ldots, i_n}$$

for any $k \in \mathbb{N}$ and any $i_1, i_2, \ldots, i_n \in \mathbb{N}$. In particular,

$$\left(\text{the } x_1^{d_1 - 1} x_2^{d_2 - 1} \cdots x_n^{d_n - 1}\text{-coefficient of } (x_1 + x_2 + \cdots + x_n)^{n-2}\right) = \binom{n-2}{d_1 - 1, d_2 - 1, \ldots, d_n - 1}.$$

⁸³ See [23wd, Lecture 18, Section 4.12] for an introduction to multinomial coefficients.

Summarizing, we find

$$\begin{aligned}
&\left(\text{the } x_1^{d_1} x_2^{d_2} \cdots x_n^{d_n}\text{-coefficient of } P(x_1, x_2, \ldots, x_n)\right) \\
&= \left(\text{the } x_1^{d_1 - 1} x_2^{d_2 - 1} \cdots x_n^{d_n - 1}\text{-coefficient of } (x_1 + x_2 + \cdots + x_n)^{n-2}\right) \\
&= \binom{n-2}{d_1 - 1, d_2 - 1, \ldots, d_n - 1}.
\end{aligned}$$

However, the $x_1^{d_1} x_2^{d_2} \cdots x_n^{d_n}$-coefficient of $P(x_1, x_2, \ldots, x_n)$ is the # of n-trees $T$
satisfying the property that

$$\deg i = d_i \quad \text{for each vertex } i$$

(as we have seen above). Thus, we have proved the following: □


Theorem 5.19.6 (refined Cayley’s formula). Let $n \geq 2$ be an integer, and let
$d_1, d_2, \ldots, d_n$ be $n$ positive integers. Then, the # of n-trees with the property
that

$$\deg i = d_i \quad \text{for each } i \in \{1, 2, \ldots, n\}$$

is the multinomial coefficient

$$\binom{n-2}{d_1 - 1, d_2 - 1, \ldots, d_n - 1}.$$

5.19.6. The weighted harmonic vector theorem 


The harmonic vector theorem for Laplacians (Theorem 5.18.1) also has a weighted
version:


Theorem 5.19.7 (harmonic vector theorem for weighted Laplacians). Let $D =
(V, A, \psi)$ be a multidigraph, where $V = \{1, 2, \ldots, n\}$ for some $n \in \mathbb{N}$. Let
$K$ be a commutative ring. Assume that an element $w_a \in K$ is assigned to
each arc $a \in A$. For each $r \in V$, let $\tau^w(D, r)$ be the sum of the weights of
all the spanning arborescences of $D$ rooted to $r$. Let $f^w$ be the row vector
$\left(\tau^w(D, 1), \tau^w(D, 2), \ldots, \tau^w(D, n)\right)$. Let $L^w$ be the weighted Laplacian of
$D$. Then, $f^w L^w = 0$.


Proof. Similar to the unweighted case. □
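
A quick numerical illustration of Theorem 5.19.7, as a sketch rather than a proof: the digraph and its integer weights below are hypothetical, and the values $\tau^w(D, r)$ are computed as the minors $\det\left(L^w_{\sim r, \sim r}\right)$, which is legitimate by the weighted Matrix-Tree Theorem.

```python
def weighted_laplacian(n, arcs):
    """L^w for vertices 1..n; arcs is a list of (source, target, weight)."""
    L = [[0] * n for _ in range(n)]
    for s, t, w in arcs:
        L[s - 1][s - 1] += w
        L[s - 1][t - 1] -= w
    return L

def minor(M, r):
    """Remove the r-th row and the r-th column (r is 1-based)."""
    return [[x for j, x in enumerate(row) if j != r - 1]
            for i, row in enumerate(M) if i != r - 1]

def det(M):
    """Determinant by Laplace expansion along the first row (tiny matrices only)."""
    if not M:
        return 1
    return sum((-1) ** j * M[0][j] * det([row[:j] + row[j + 1:] for row in M[1:]])
               for j in range(len(M)))

# Hypothetical weighted digraph on the vertices 1, 2, 3.
n = 3
arcs = [(1, 2, 5), (2, 3, 7), (1, 3, 11), (3, 1, 13)]
L = weighted_laplacian(n, arcs)

# tau^w(D, r) for r = 1, ..., n, computed via the weighted MTT.
f = [det(minor(L, r)) for r in range(1, n + 1)]

# Row vector times matrix: (f L)_j = sum_i f_i L_{i,j}.
fL = [sum(f[i] * L[i][j] for i in range(n)) for j in range(n)]
print(f, fL)
```

For this digraph, the product comes out as the zero row vector, in line with the theorem.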


Here ends our study of spanning trees and their enumeration. An interested
reader can learn more from [Rubey00], [Holzer22], [Moon70] and [GrSaSu14].


6. Colorings


Now to something different: Let’s color the vertices of a graph! 


6.1. Definition 


This is a serious course, so our colors are positive integers. Coloring the vertices 
thus means assigning a color (= a positive integer) to each vertex. Here are the 
details: 


Definition 6.1.1. Let $G = (V, E, \varphi)$ be a multigraph. Let $k \in \mathbb{N}$.


(a) A k-coloring of $G$ means a map $f : V \to \{1, 2, \ldots, k\}$. Given such a
k-coloring $f$, we refer to the numbers $1, 2, \ldots, k$ as the colors, and we
refer to each value $f(v)$ as the color of the vertex $v$ in the k-coloring $f$.


(b) A k-coloring $f$ of $G$ is said to be proper if no two adjacent vertices of
$G$ have the same color. (In other words, a k-coloring $f$ of $G$ is proper if
there exists no edge of $G$ whose endpoints $u$ and $v$ satisfy $f(u) = f(v)$.)
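
Part (b) of this definition translates directly into a one-line test. The sketch below is an illustration, not part of the original notes; it represents a multigraph simply by the list of endpoint pairs of its edges (a loop at $v$ being the pair $(v, v)$), and the example graph is hypothetical.

```python
def is_proper(edges, f):
    """Return True if no edge has two endpoints of the same color.

    edges: list of (u, v) endpoint pairs; f: dict mapping each vertex to its color.
    """
    return all(f[u] != f[v] for u, v in edges)

# Hypothetical example: the 4-cycle 1 - 2 - 3 - 4 - 1.
cycle = [(1, 2), (2, 3), (3, 4), (4, 1)]
alternating = {1: 1, 2: 2, 3: 1, 4: 2}
clashing = {1: 1, 2: 1, 3: 2, 4: 3}

print(is_proper(cycle, alternating))             # the colors alternate along the cycle
print(is_proper(cycle, clashing))                # vertices 1 and 2 share color 1
print(is_proper(cycle + [(3, 3)], alternating))  # a loop rules out properness
```

The last call shows why a graph with a loop has no proper coloring at all: the two endpoints of a loop coincide, so they always have the same color.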


Example 6.1.2. Here are two 7-colorings of a graph:

[Figure: two 7-colorings of the same graph, shown side by side]

(where the numbers on the nodes are not the vertices, but rather the colors
of the vertices). The 7-coloring on the left (yes, it is a 7-coloring, even though
it does not actually use the colors 3, 6 and 7) is not proper, because the two
adjacent vertices on the top left have the same color. The 7-coloring on the
right, however, is proper.

Example 6.1.3. Here is a bunch of graphs:

[Figure: four graphs, labeled A, B, C and D]

Which of them have proper 3-colorings?


• The graph A has a proper 3-coloring. For example, the map $f$ that
sends the vertices 1, 2, 3, 4, 5 to the colors 1, 2, 1, 2, 3 (respectively) is a
proper 3-coloring.


• The graph B has no proper 3-coloring. Indeed, the four vertices 2, 3, 4, 5
are mutually adjacent, so they would have to have 4 distinct colors in a
proper k-coloring; but this is not possible unless $k \geq 4$.


• The graph C has a proper 3-coloring and even a proper 2-coloring (e.g.,
assigning color 1 to each odd vertex and color 2 to each even vertex).


• The graph D has no proper 3-coloring and, in fact, no proper k-coloring
for any $k \in \mathbb{N}$. The reason is that the vertex 3 is adjacent to itself, but
obviously has the same color as itself no matter what the k-coloring is.
More generally, a graph with a loop cannot have a proper k-coloring
for any $k \in \mathbb{N}$.


Example 6.1.4. Let $n \in \mathbb{N}$. The n-hypercube $Q_n$ (introduced in Definition
2.14.7) has a proper 2-coloring: Namely, the map

$$f : \{0, 1\}^n \to \{1, 2\}, \qquad (a_1, a_2, \ldots, a_n) \mapsto \begin{cases} 1, & \text{if } a_1 + a_2 + \cdots + a_n \text{ is odd}; \\ 2, & \text{if } a_1 + a_2 + \cdots + a_n \text{ is even} \end{cases}$$

is a proper 2-coloring of $Q_n$. (Check this! It boils down to the fact that if two
bitstrings $(a_1, a_2, \ldots, a_n)$ and $(b_1, b_2, \ldots, b_n)$ differ in exactly one entry, then
the corresponding sums $a_1 + a_2 + \cdots + a_n$ and $b_1 + b_2 + \cdots + b_n$ differ by
exactly 1.)
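
The “Check this!” can be delegated to a computer for small $n$. The sketch below is an illustration, not part of the original notes; it assumes the usual description of $Q_n$ (vertices are bitstrings of length $n$, and two bitstrings are adjacent when they differ in exactly one entry), and the function names are hypothetical.

```python
from itertools import product

def hypercube_edges(n):
    """Edges of Q_n: pairs of bitstrings that differ in exactly one entry."""
    edges = []
    for a in product((0, 1), repeat=n):
        for i in range(n):
            if a[i] == 0:  # flip a 0 into a 1, so each edge is listed once
                b = a[:i] + (1,) + a[i + 1:]
                edges.append((a, b))
    return edges

def parity_color(a):
    """Color 1 if the entries of a sum to an odd number, color 2 otherwise."""
    return 1 if sum(a) % 2 == 1 else 2

n = 4
edges = hypercube_edges(n)
is_proper = all(parity_color(a) != parity_color(b) for a, b in edges)
print(len(edges), is_proper)
```

Each edge changes the entry sum by exactly 1, so its two endpoints land in different parity classes; the code merely confirms this for every edge of $Q_4$.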


Example 6.1.5. Let $n$ and $m$ be two positive integers. The Cartesian product
$P_n \times P_m$ of the n-th path graph $P_n$ and the m-th path graph $P_m$ is known as
the (n, m)-grid graph, as it looks as follows:

[Figure: the (n, m)-grid graph]

This (n, m)-grid graph $P_n \times P_m$ has a proper 2-coloring: namely, the map that
sends each vertex $(i, j)$ to

$$\begin{cases} 1, & \text{if } i + j \text{ is even}; \\ 2, & \text{if } i + j \text{ is odd}. \end{cases}$$

This 2-coloring is called the “chessboard coloring” for a fairly obvious
reason (view each vertex as a square of a chessboard).

More generally, if $G$ and $H$ are two simple graphs each having a proper
2-coloring, then their Cartesian product $G \times H$ has a proper 2-coloring as
well. (See Exercise 6.4 for the proof.)

Example 6.1.6. Here is the Petersen graph (as defined in Subsection 2.6.3):

[Figure: the Petersen graph]

I claim that it has a proper 3-coloring. Can you find it?


As we see, some graphs have proper 3-colorings, while others don’t. Clearly, 
having 4 mutually adjacent vertices makes a proper 3-coloring impossible (in- 
deed, by the pigeonhole principle, two of them must have the same color), but 
this is far from an “if and only if”. The question of determining whether a 
given graph has a proper 3-coloring is NP-complete. 
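
For small graphs, the existence of a proper k-coloring can still be settled by exhaustive search. The sketch below is an illustration, not part of the original notes; it assumes the standard description of the Petersen graph as the graph whose vertices are the 2-element subsets of $\{1, 2, 3, 4, 5\}$, with two subsets adjacent when they are disjoint. It only reports whether a proper coloring exists, so as not to spoil Example 6.1.6.

```python
from itertools import combinations, product

# Petersen graph: 2-element subsets of {1, ..., 5}, adjacent when disjoint.
vertices = [frozenset(s) for s in combinations(range(1, 6), 2)]
edges = [(a, b) for a, b in combinations(vertices, 2) if not (a & b)]

def proper_colorings(k):
    """Yield all proper k-colorings as dicts (brute force over all k^10 maps)."""
    for colors in product(range(1, k + 1), repeat=len(vertices)):
        f = dict(zip(vertices, colors))
        if all(f[a] != f[b] for a, b in edges):
            yield f

has_proper_2 = next(proper_colorings(2), None) is not None
has_proper_3 = next(proper_colorings(3), None) is not None
print(len(vertices), len(edges), has_proper_2, has_proper_3)
```

The search visits up to $k^{10}$ maps, which is fine here but grows exponentially with the number of vertices, as one would expect for a brute-force approach to an NP-complete problem.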


6.2. 2-colorings 


In contrast, the existence of proper 2-colorings is a much simpler question. The 
following is a nice criterion: 


Theorem 6.2.1 (2-coloring equivalence theorem). Let $G = (V, E, \varphi)$ be a
multigraph. Then, the following three statements are equivalent:


• Statement B1: The graph $G$ has a proper 2-coloring.
• Statement B2: The graph $G$ has no cycles of odd length.
• Statement B3: The graph $G$ has no circuits of odd length.

To prove this theorem, we will need a fact that is somewhat similar to
Proposition 3.3.14:


Proposition 6.2.2. Let G be a multigraph. Let u and v be two vertices of 
G. Let w be an odd-length walk from u to v. Then, w contains either an 
odd-length path from u to v or an odd-length cycle (or both). 


Here, we are using the following rather intuitive terminology: 


• A walk is said to be odd-length if its length is odd.


• A walk $w$ is said to contain a walk $v$ if each edge of $v$ is an edge of
$w$. (This does not necessarily mean that $v$ appears in $w$ as a contiguous
block.)


• We remind the reader once again that a “circuit” just means a closed walk
to us; we impose no further requirements.


Example 6.2.3. Consider the following simple graph (which we treat as a
multigraph):

[Figure: a simple graph on the vertices 1, 2, ..., 7]

(a) The odd-length walk $(1, *, 2, *, 3, *, 4, *, 5, *, 2, *, 6, *, 7)$ (we are using
asterisks for the edges, since they can be trivially recovered from the vertices)
contains the odd-length path $(1, *, 2, *, 6, *, 7)$ from 1 to 7.

(b) The odd-length walk $(3, *, 2, *, 1, *, 6, *, 2, *, 3)$ contains the odd-length
cycle $(2, *, 1, *, 6, *, 2)$.


Proof of Proposition 6.2.2. We apply strong induction on the length of $w$.

Thus, we fix a $k \in \mathbb{N}$, and we assume (as the induction hypothesis) that
Proposition 6.2.2 is already proved for all odd-length walks of length $< k$. Now,
we must prove it for an odd-length walk $w$ of length $k$.

Write this walk $w$ as $w = (w_0, *, w_1, *, w_2, \ldots, *, w_k)$. Hence, $k$ is the length
of $w$, and thus is odd.

We must prove that $w$ contains either an odd-length path from $u$ to $v$ or an
odd-length cycle.

If $w$ itself is a path, then we are done. So WLOG assume that $w$ is not a path.
Thus, two of the vertices $w_0, w_1, \ldots, w_k$ of $w$ are equal. In other words, there
exists a pair $(i, j)$ of integers $i$ and $j$ with $0 \leq i < j \leq k$ and $w_i = w_j$. Among
all such pairs, we pick one with minimum difference $j - i$. Then, the vertices
$w_i, w_{i+1}, \ldots, w_{j-1}$ are distinct (since $j - i$ is minimum).
Let $c$ be the part of $w$ between $w_i$ and $w_j$; thus,⁸⁴

$$c = (w_i, *, w_{i+1}, *, \ldots, *, w_j).$$

This $c$ is clearly a closed walk (since $w_i = w_j$). If $j - i$ is odd, then this closed
walk $c$ is a cycle (indeed, its vertices $w_i, w_{i+1}, \ldots, w_{j-1}$ are distinct, and therefore
its edges are distinct as well⁸⁵), and thus we have found an odd-length cycle
contained in $w$ (namely, $c$ is such a cycle, since its length is $j - i$, which is odd).
This means that we are done if $j - i$ is odd.

Thus, we WLOG assume that $j - i$ is even. Hence, cutting out the closed walk
$c$ from the original walk $w$, we obtain a walk

$$w' = (w_0, *, w_1, *, \ldots, *, w_i = w_j, *, w_{j+1}, *, w_{j+2}, \ldots, w_k)$$

from $u$ to $v$. This new walk $w'$ has length $k - (j - i)$, which is odd (since $k$ is
odd but $j - i$ is even) and smaller than $k$ (since $i < j$). Hence, we can apply the
induction hypothesis to this walk $w'$. As a consequence, we conclude that this
walk $w'$ contains either an odd-length path from $u$ to $v$ or an odd-length cycle.
Therefore, the walk $w$ also contains either an odd-length path from $u$ to $v$ or an
odd-length cycle (since anything contained in $w'$ is automatically contained in
$w$). But this is precisely what we set out to prove. This completes the induction
step, and so we have proved Proposition 6.2.2. □


⁸⁴ Here is an illustration (which, however, is a bit simplistic: the walk $w$ can intersect itself
arbitrarily many times, not just once as shown here):

[Figure: a walk $w$ that intersects itself once, with the closed walk $c$ highlighted]

The blue edges here form the walk $c$.

⁸⁵ For the very skeptical, here is a proof of this:

Assume (for the sake of contradiction) that the walk $c$ has two equal edges. Let the first
of them be an edge between $w_p$ and $w_{p+1}$, and let the second be an edge between $w_q$ and
$w_{q+1}$, for some distinct elements $p$ and $q$ of $\{i, i+1, \ldots, j-1\}$. Since equal edges have equal
endpoints, we thus have $\{w_p, w_{p+1}\} = \{w_q, w_{q+1}\}$, so that $w_p \in \{w_p, w_{p+1}\} = \{w_q, w_{q+1}\}$.
In other words, $w_p$ equals either $w_q$ or $w_{q+1}$. Since $w_p \neq w_q$ (because $w_i, w_{i+1}, \ldots, w_{j-1}$ are
distinct), this entails that $w_p = w_{q+1}$. Similarly, $w_q = w_{p+1}$.

However, $p$ and $q$ are distinct. Thus, at least one of $p$ and $q$ is distinct from $j - 1$. We
WLOG assume that $q \neq j - 1$ (otherwise, we can swap $p$ with $q$). Hence, $q + 1 \neq j$, so
that $q + 1 \in \{i, i+1, \ldots, j-1\}$. Thus, from $w_p = w_{q+1}$, we conclude that $p = q + 1$ (since
$w_i, w_{i+1}, \ldots, w_{j-1}$ are distinct). Thus, $p = q + 1 > q$, so that $p + 1 > p > q$ and therefore
$p + 1 \neq q$. However, $w_q = w_{p+1}$. If $p + 1$ was an element of $\{i, i+1, \ldots, j-1\}$, then this
would entail $q = p + 1$ (since $w_i, w_{i+1}, \ldots, w_{j-1}$ are distinct), which would contradict
$p + 1 \neq q$. Thus, $p + 1$ cannot be an element of $\{i, i+1, \ldots, j-1\}$. Hence, $p + 1 = j$ (since $p + 1$
clearly belongs to $\{i, i+1, \ldots, j\}$). Thus, $w_{p+1} = w_j = w_i$, so that $w_i = w_{p+1} = w_q$. This
entails $i = q$ (since $w_i, w_{i+1}, \ldots, w_{j-1}$ are distinct). Hence, $i = q = p - 1$ (since $p = q + 1$).
Therefore, $j - i = (p + 1) - (p - 1) = 2$ (since $j = p + 1$ and $i = p - 1$). This contradicts
the fact that $j - i$ is odd.

This contradiction shows that our assumption (that the walk $c$ has two equal edges) was
false. Hence, the edges of $c$ are distinct.
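
The induction in this proof is effectively an algorithm: repeatedly find a closest pair of equal vertices, stop if the closed walk between them has odd length, and otherwise cut it out. Here is a minimal sketch of that procedure, not part of the original notes. It assumes a simple graph, so that a walk can be stored as its list of vertices, and the function name is hypothetical.

```python
def odd_path_or_cycle(walk):
    """Extract an odd-length path or an odd-length cycle from an odd-length walk.

    walk: list of vertices of a walk in a simple graph.
    Returns ("path", vertices) or ("cycle", vertices).
    """
    assert (len(walk) - 1) % 2 == 1, "the walk must have odd length"
    while True:
        # Find a pair i < j with walk[i] == walk[j] and j - i minimum.
        # Such a pair always consists of two consecutive occurrences of a vertex.
        best = None
        last_seen = {}
        for j, v in enumerate(walk):
            if v in last_seen:
                i = last_seen[v]
                if best is None or j - i < best[1] - best[0]:
                    best = (i, j)
            last_seen[v] = j
        if best is None:
            return ("path", walk)          # no repeated vertex: the walk is a path
        i, j = best
        if (j - i) % 2 == 1:
            return ("cycle", walk[i:j + 1])  # odd closed walk with distinct vertices
        walk = walk[:i] + walk[j:]         # cut out the even-length closed walk

print(odd_path_or_cycle([1, 2, 3, 4, 5, 2, 6, 7]))
print(odd_path_or_cycle([3, 2, 1, 6, 2, 3]))
```

The two calls replay parts (a) and (b) of Example 6.2.3. Each cut shortens the walk by an even amount, so the length stays odd and the loop terminates.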


Now, let us prove the 2-coloring equivalence theorem: 


Proof of Theorem 6.2.1. We shall prove the implications B1 ⟹ B2 ⟹ B3 ⟹
B1.


Proof of the implication B1 ⟹ B2: Assume that Statement B1 holds. We must
prove that Statement B2 holds.

We have assumed that B1 holds. In other words, the graph $G$ has a proper
2-coloring. Let $f$ be this 2-coloring. Thus, $f$ is a map from $V$ to $\{1, 2\}$ such that
any two adjacent vertices $x$ and $y$ of $G$ satisfy $f(x) \neq f(y)$.

Assume (for contradiction) that $G$ has a cycle of odd length. Let

$$(v_0, *, v_1, *, v_2, *, \ldots, *, v_k)$$

be this cycle. Thus, $k$ is odd, and we have $v_k = v_0$, so that $f(v_k) = f(v_0)$.
Moreover, for each $i \in \{1, 2, \ldots, k\}$, the vertex $v_i$ is adjacent to $v_{i-1}$ (since
$(v_0, *, v_1, *, v_2, *, \ldots, *, v_k)$ is a cycle) and therefore satisfies

$$f(v_i) \neq f(v_{i-1}) \tag{37}$$

(since $f$ is a proper 2-coloring).

We WLOG assume that $f(v_0) = 1$ (otherwise, we “rename” the colors 1
and 2 so that the color $f(v_0)$ becomes 1). Then, (37) (applied to $i = 1$) yields
$f(v_1) \neq f(v_0) = 1$, so that $f(v_1) = 2$ (since $f(v_1)$ must be either 1 or 2).
Hence, (37) (applied to $i = 2$) yields $f(v_2) \neq f(v_1) = 2$, so that $f(v_2) = 1$
(since $f(v_2)$ must be either 1 or 2). For similar reasons, we can successively
obtain $f(v_3) = 2$ and $f(v_4) = 1$ and $f(v_5) = 2$ and so on. The general formula
we obtain (strictly speaking, it needs to be proved by induction on $i$) says that

$$f(v_i) = \begin{cases} 1, & \text{if } i \text{ is even}; \\ 2, & \text{if } i \text{ is odd} \end{cases} \qquad \text{for each } i \in \{0, 1, \ldots, k\}.$$

Applying this to $i = k$, we conclude that $f(v_k) = 2$ (since $k$ is odd). However,
this contradicts $f(v_k) = f(v_0) = 1 \neq 2$. This contradiction shows that our
assumption was false. Hence, $G$ has no cycle of odd length. In other words,
Statement B2 holds. This proves the implication B1 ⟹ B2.


Proof of the implication B2 ⟹ B3: Assume that Statement B2 holds. We must
prove that Statement B3 holds. In other words, we must show that $G$ has no
odd-length circuits.

Assume the contrary. Thus, $G$ has an odd-length circuit $w$. Let $u$ be the
starting and ending point of $w$. Thus, Proposition 6.2.2 (applied to $v = u$)
shows that this odd-length circuit $w$ contains either an odd-length path from
$u$ to $u$ or an odd-length cycle. Since $G$ has no odd-length cycle (because we
assumed that Statement B2 holds), we thus conclude that $w$ contains an
odd-length path from $u$ to $u$. However, an odd-length path from $u$ to $u$ is impossible
(since the only path from $u$ to $u$ has length 0). Thus, we obtain a contradiction,
which shows that $G$ has no odd-length circuits. This proves the implication B2
⟹ B3.


Proof of the implication B3 ⟹ B1: Assume that Statement B3 holds. We must
prove that Statement B1 holds.

We have assumed that Statement B3 holds. In other words, $G$ has no
odd-length circuits. We must find a proper 2-coloring of $G$.

We WLOG assume that $G$ is connected (otherwise, let $C_1, C_2, \ldots, C_k$ be the
components of $G$, and apply the implication B3 ⟹ B1 to each of the smaller
graphs $G[C_1], G[C_2], \ldots, G[C_k]$, and then combine the resulting proper
2-colorings of these smaller graphs into a single proper 2-coloring of $G$). Fix any
vertex $r$ of $G$. Define a map $f : V \to \{1, 2\}$ by setting

$$f(v) = \begin{cases} 1, & \text{if } d(v, r) \text{ is even}; \\ 2, & \text{if } d(v, r) \text{ is odd} \end{cases} \qquad \text{for each } v \in V$$

(where $d(v, r)$ denotes the distance from $v$ to $r$, that is, the smallest length of a
path from $v$ to $r$).
I claim that $f$ is a proper 2-coloring.⁸⁶ Indeed, assume the contrary. Thus,
some two adjacent vertices $u$ and $v$ have the same color $f(u) = f(v)$. Consider
these $u$ and $v$. Since $f(u) = f(v)$, we are in one of the following two cases:

Case 1: We have $f(u) = f(v) = 1$.

Case 2: We have $f(u) = f(v) = 2$.

Let us consider Case 2. In this case, we have $f(u) = f(v) = 2$. This means
that $d(u, r)$ and $d(v, r)$ are both odd (by the definition of $f$). Hence, there
is an odd-length path $p$ from $u$ to $r$ and an odd-length path $q$ from $v$ to $r$.
Consider these $p$ and $q$. Also, there is an edge $e$ that joins $u$ and $v$ (since $u$
and $v$ are adjacent). Consider this edge $e$. By combining the paths $p$ and $q$
and inserting the edge $e$ into the result, we obtain a circuit from $r$ to $r$ (which
starts by following the path $p$ backwards to $u$, then takes the edge $e$ to $v$, then
follows the path $q$ back to $r$). This circuit has odd length (since $p$ and $q$ have
odd lengths, and since the edge $e$ adds 1 to the length). Thus, we have found
an odd-length circuit of $G$. However, we assumed that $G$ has no odd-length
circuits. Contradiction!

Thus, we have found a contradiction in Case 2. Similarly, we can find a
contradiction in Case 1. Thus, we always get a contradiction. This shows that
$f$ is indeed a proper 2-coloring. Thus, Statement B1 holds. This proves the
implication B3 ⟹ B1.⁸⁷


⁸⁶ Here is an illustrative example:

[Figure: a connected graph whose nodes are labeled with the colors assigned by $f$]

(Of course, the numbers on the nodes here are not the vertices, but rather the colors of these
vertices.)

Note that all values of $f$ can be easily found by the following recursive algorithm: Start
by assigning the color 1 to $r$. Then, assign the color 2 to all neighbors of $r$. Then, assign the
color 1 to all neighbors of these neighbors (unless they have already been colored). Then,
assign the color 2 to all neighbors of these neighbors of these neighbors, and so on.
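
The construction in this proof, together with the layer-by-layer procedure described in its footnote, amounts to a breadth-first search. The following sketch (not part of the original notes; the function name and the graph encoding by endpoint pairs are illustrative choices) colors each component from a root by distance parity and then checks whether the result is proper.

```python
from collections import deque

def two_color(vertices, edges):
    """Return a proper 2-coloring (dict vertex -> 1 or 2), or None if none exists."""
    adj = {v: [] for v in vertices}
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    f = {}
    for r in vertices:                 # one search per component, rooted at r
        if r in f:
            continue
        f[r] = 1                       # d(r, r) = 0 is even
        queue = deque([r])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in f:
                    f[v] = 3 - f[u]    # one step further from r: the parity flips
                    queue.append(v)
    # By Theorem 6.2.1, this coloring is proper whenever any proper 2-coloring exists.
    if any(f[u] == f[v] for u, v in edges):
        return None
    return f

print(two_color([1, 2, 3, 4], [(1, 2), (2, 3), (3, 4), (4, 1)]))
print(two_color([1, 2, 3], [(1, 2), (2, 3), (3, 1)]))
```

The first call colors a 4-cycle; the second reports that a triangle, which is an odd cycle, has no proper 2-coloring.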


For aesthetic reasons, let me give a second proof of the implication B3 ⟹ B1, which
avoids the awkward “break G up into components” step:

Assume again that Statement B3 holds. We must prove that Statement B1 holds.

We assumed that Statement B3 holds. In other words, $G$ has no odd-length circuits.
In particular, $G$ has no odd-length cycles (since every cycle is a circuit).

Two vertices $u$ and $v$ of $G$ will be called oddly connected if $G$ has an odd-length path
from $u$ to $v$. By Proposition 6.2.2, this condition is equivalent to “$G$ has an odd-length
walk from $u$ to $v$”, since $G$ has no odd-length cycles. Moreover, a vertex $u$ cannot be
oddly connected to itself (since the only path from $u$ to $u$ is the trivial length-0 path
$(u)$, which is not odd-length).

A subset $A$ of $V$ will be called odd-path-less if no two vertices in $A$ are oddly
connected. (Note that “two vertices” doesn’t mean “two distinct vertices”.)

Pick a maximum-size odd-path-less subset $A$ of $V$ (such an $A$ exists, since $\varnothing$ is
clearly odd-path-less). Now, let $f : V \to \{1, 2\}$ be the 2-coloring of $G$ that assigns the
color 1 to all vertices in $A$ and assigns the color 2 to all vertices not in $A$.

We shall show that this 2-coloring $f$ is proper.

To prove this, we must show that no two adjacent vertices have color 1 and that no
two adjacent vertices have color 2. The first of these two claims is obvious.⁸⁸ It thus
remains to prove the second claim, i.e., to prove that no two adjacent vertices have
color 2.


⁸⁷ Note that this proof provides a reasonably efficient algorithm for constructing a proper
2-coloring of $G$, as long as you know how to compute distances in a graph (we have done
this, e.g., in homework set #4 exercise 5) and how to compute the components of a graph
(this is not hard).

⁸⁸ Proof. An edge always makes a walk of length 1, which is odd. Thus, two adjacent vertices
are automatically oddly connected. Hence, two adjacent vertices cannot both be contained
in the odd-path-less subset $A$. In other words, two adjacent vertices cannot both have color
1.

Assume the contrary. Thus, there exist two adjacent vertices u and v that both have 
color 2. Consider these u and v. These vertices u and v have color 2; in other words, 
neither of them belongs to A. 

The vertex u is not oddly connected to itself (as we already saw). Hence, the vertex 
u is oddly connected to at least one vertex a € A (because otherwise, we could insert u 
into the odd-path-less set A and obtain a larger odd-path-less subset A U {u} of V; but 
this would contradict the fact that A is a maximum-size odd-path-less subset of V). For 
similar reasons, the vertex v is oddly connected to at least one vertex b € A. Consider 
these vertices a and b. Since u is oddly connected to a, there exists an odd-length walk 
p from u to a. Reversing this walk p yields an odd-length walk p’ from a to u. Since v 
is oddly connected to b, there exists an odd-length walk q from v to b. Finally, there is 
an edge e with endpoints u and v (since u and v are adjacent). Combine the two walks 
p’ and q and insert this edge e between them; this yields a walk from a to b (via u and 
v) that has odd length (since p’ and q have odd length each, and inserting e adds 1 to 
the length). Thus, G has an odd-length walk from a to b. In other words, the vertices 
a and b are oddly connected. This contradicts the fact that the set A is odd-path-less 
(since a and b belong to A). 

This contradiction shows that our assumption was false. Thus, we have shown that
no two adjacent vertices have color 2. This completes our proof that $f$ is a proper
2-coloring. Thus, Statement B1 holds. This proves the implication B3 ⟹ B1 once again.


Having proved all three implications B1 ⟹ B2 and B2 ⟹ B3 and B3 ⟹
B1, we now conclude that the three statements B1, B2 and B3 are equivalent.
This proves Theorem 6.2.1. □


Remark 6.2.4. A graph $G$ that satisfies the three equivalent statements B1, B2,
B3 of Theorem 6.2.1 is sometimes called a “bipartite graph”. This is slightly
imprecise, since the proper definition of a “bipartite graph” is (equivalent to)
“a graph equipped with a proper 2-coloring”. Thus, if we equip one and
the same graph $G$ with different proper 2-colorings, then we obtain different
bipartite graphs. We shall take a closer look at bipartite graphs in later sections.


A further simple property of proper 2-colorings is the following | 


Proposition 6.2.5. Let G be a multigraph that has a proper 2-coloring. Then, 
G has exactly 2°°""© many proper 2-colorings. 


Proof sketch. For each component C of G, let us fix an arbitrary vertex $r_C \in C$. 
When constructing a proper 2-coloring f of G, we can freely choose the colors 
$f(r_C)$ of these vertices $r_C$; the colors of all other vertices are then uniquely 
determined (see the first proof of the implication B3 $\Longrightarrow$ B1 in our above proof 
of Theorem 6.2.1 for the details). Thus, we have $2^{\operatorname{conn} G}$ many options (since G 
has $\operatorname{conn} G$ many components). The proposition follows. $\square$
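
The count in Proposition 6.2.5 can be checked by brute force on a small example. The following is only a sketch: the graph, the edge-list encoding and the helper names `count_components` and `count_proper_colorings` are assumptions made for illustration, not notation from these notes.

```python
from itertools import product


def count_components(vertices, edges):
    """Return conn G, the number of components, via a small union-find."""
    parent = {v: v for v in vertices}

    def find(v):
        while parent[v] != v:
            v = parent[v]
        return v

    for u, w in edges:
        parent[find(u)] = find(w)
    return len({find(v) for v in vertices})


def count_proper_colorings(vertices, edges, k):
    """Count proper k-colorings (colors 1..k) by trying every map V -> {1..k}."""
    total = 0
    for colors in product(range(1, k + 1), repeat=len(vertices)):
        f = dict(zip(vertices, colors))
        if all(f[u] != f[w] for u, w in edges):
            total += 1
    return total


# Hypothetical example: a 4-cycle, a doubled edge, and an isolated vertex.
V = [1, 2, 3, 4, 5, 6, 7]
E = [(1, 2), (2, 3), (3, 4), (4, 1), (5, 6), (5, 6)]

# This graph has 3 components and a proper 2-coloring, so we expect 2**3 = 8.
print(count_components(V, E), count_proper_colorings(V, E, 2))
```

The brute force is exponential in the number of vertices, so it is only meant as a sanity check on tiny graphs.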


(Footnote: Recall that $\operatorname{conn} G$ denotes the number of components of a graph G.)

6.3. The Brooks theorems 


As we said, deciding the existence of a proper k-coloring for a given graph G is a hard 
computational problem unless $k \le 2$. The same holds for theoretical criteria: 
For $k > 2$, I am not aware of any good criteria that are simultaneously necessary 
and sufficient for the existence of a proper k-coloring. However, some sufficient 
criteria are known. Here is one:


Theorem 6.3.1 (Little Brooks theorem). Let $G = (V, E, \varphi)$ be a loopless 
multigraph with at least one vertex. Let 

\[ \alpha := \max \{ \deg v \mid v \in V \} . \]

Then, G has a proper $(\alpha + 1)$-coloring. 


Proof sketch. Let $v_1, v_2, \ldots, v_n$ be the vertices of V, listed in some order (with no 
repetitions). We construct a proper $(\alpha + 1)$-coloring $f : V \to \{1, 2, \ldots, \alpha + 1\}$ of 
G recursively as follows: 


• First, we choose $f(v_1)$ arbitrarily. 


• Then, we choose $f(v_2)$ to be distinct from the colors of all already-colored 
neighbors of $v_2$. 


• Then, we choose $f(v_3)$ to be distinct from the colors of all already-colored 
neighbors of $v_3$. 


• Then, we choose $f(v_4)$ to be distinct from the colors of all already-colored 
neighbors of $v_4$. 


• And so on, until all values $f(v_1), f(v_2), \ldots, f(v_n)$ have been chosen. 


Why do we never run out of colors in this process? Well: When choosing $f(v_i)$, 
we must choose a color distinct from the colors of all already-colored neighbors 
of $v_i$. Since $v_i$ has at most $\alpha$ neighbors (because $\deg(v_i) \le \alpha$), this means that 
we have at most $\alpha$ colors to avoid. Since there are $\alpha + 1$ colors in total, this 
leaves us at least 1 color that we can choose; therefore, we don’t run out of 
colors. 

The resulting $(\alpha + 1)$-coloring $f : V \to \{1, 2, \ldots, \alpha + 1\}$ is called a greedy 
coloring. This $(\alpha + 1)$-coloring f is indeed proper, because if an edge has 
endpoints $v_i$ and $v_j$ with $i > j$, then the construction of $f(v_i)$ ensures that $f(v_i)$ is 
distinct from $f(v_j)$. (Note how we are using the fact that G is loopless here! If 
G had a loop, then the endpoints of this loop could not be written as $v_i$ and $v_j$ 
with $i > j$.) $\square$
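
The greedy procedure from this proof can be sketched in a few lines of Python. This is an illustration under assumptions: the proof allows any admissible color at each step, whereas the sketch always picks the smallest one; the edge-list encoding and the function names `greedy_coloring` and `max_degree` are hypothetical.

```python
def greedy_coloring(order, edges):
    """Color the vertices in the given order with colors 1, 2, 3, ...

    `edges` is a list of pairs (u, w) with u != w (a loopless multigraph;
    parallel edges are simply repeated). Each vertex receives the smallest
    color that is not used by any already-colored neighbor.
    """
    neighbors = {v: set() for v in order}
    for u, w in edges:
        if u == w:
            raise ValueError("the greedy argument needs a loopless graph")
        neighbors[u].add(w)
        neighbors[w].add(u)

    f = {}
    for v in order:
        used = {f[w] for w in neighbors[v] if w in f}
        color = 1
        while color in used:
            color += 1
        f[v] = color
    return f


def max_degree(order, edges):
    """Return alpha, the maximum degree (parallel edges are counted)."""
    deg = {v: 0 for v in order}
    for u, w in edges:
        deg[u] += 1
        deg[w] += 1
    return max(deg.values())


# Hypothetical example: the 5-cycle, where alpha = 2 and 3 colors are needed.
V = [1, 2, 3, 4, 5]
E = [(1, 2), (2, 3), (3, 4), (4, 5), (5, 1)]
f = greedy_coloring(V, E)
```

Since at most $\alpha$ colors are forbidden at each step, the smallest free color never exceeds $\alpha + 1$, which is exactly the bound of Theorem 6.3.1.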


Recall that a multigraph is called loopless if it has no loops. 


An introduction to graph theory, version August 2, 2023 page 284 


In general, the $\alpha + 1$ in Theorem 6.3.1 cannot be improved. Here are two 
examples: 


• If $n \ge 2$, then the cycle graph $C_n$ has maximum degree 
$\alpha = \max \{ \deg v \mid v \in V \} = 2$. Thus, Theorem 6.3.1 shows that $C_n$ has a 
proper 3-coloring. When n is even, $C_n$ has a proper 2-coloring as well, but 
this is not the case when n is odd (by Theorem 6.2.1). 


• If $n \ge 1$, then the complete graph $K_n$ has maximum degree 
$\alpha = \max \{ \deg v \mid v \in V \} = n - 1$. Thus, Theorem 6.3.1 shows that $K_n$ 
has a proper n-coloring. By the pigeonhole principle, it is clear that $K_n$ 
has no proper $(n - 1)$-coloring. 


Interestingly, these two examples are in fact the only cases when a connected 
loopless multigraph with maximum degree $\alpha$ can fail to have a proper 
$\alpha$-coloring. In all other cases, we can improve the $\alpha + 1$ to $\alpha$: 


Theorem 6.3.2 (Brooks theorem). Let $G = (V, E, \varphi)$ be a connected loopless 
multigraph. Let 

\[ \alpha := \max \{ \deg v \mid v \in V \} . \]

Assume that G is neither a complete graph nor an odd-length cycle. Then, G 
has a proper $\alpha$-coloring. 


Proof. Despite the seemingly small difference, this is significantly harder to 
prove than Theorem 6.3.1. Various proofs can be found in [CraRab15] and 
in most serious textbooks on graph theory. $\square$


6.4. Exercises on proper colorings 


Exercise 6.1. Let G be a simple graph with n vertices. Let k be a positive 
integer. 
Prove the following: 


(a) If G has a proper k-coloring, then G has no subgraph isomorphic to 
$K_{k+1}$. 


(b) If $k \ge n - 2$, then the converse to part (a) also holds: If G has no 
subgraph isomorphic to $K_{k+1}$, then G has a proper k-coloring. 


(c) Does the converse to part (a) hold for $k < n - 2$ as well? Specifically, 
does it hold for n = 5 and k = 2? 


(Footnote: See the definition of the cycle graph $C_n$ for the proper treatment of the case n = 2.)

Exercise 6.2. Let G be a connected loopless multigraph. Prove that G has a 
proper 2-coloring if and only if every three vertices u, v, w of G satisfy 

\[ 2 \mid d(u, v) + d(v, w) + d(w, u) . \]


Exercise 6.3. Fix two positive integers n and k with $n \ge 2k > 0$. Let $S = 
\{1, 2, \ldots, n\}$. Consider the k-Kneser graph $K_{S,k}$ as defined in Subsection 2.6.3. 
Prove that $K_{S,k}$ has a proper $(n - 2k + 2)$-coloring. 


[Hint: What can you say about the minima (i.e., smallest elements) of two 
disjoint subsets of S? (Being distinct is a good first step.)] 


[Remark: Lovász proved in 1978 (using topology!) that this result is 
optimal, in the sense that $n - 2k + 2$ is the smallest integer q such that $K_{S,k}$ 
has a proper q-coloring.] 


Exercise 6.4. Let $k \in \mathbb{N}$. Let G and H be two simple graphs. Assume that 
each of G and H has a proper k-coloring. Prove that the Cartesian product 
$G \times H$ (defined in Definition 2.14.10) has a proper k-coloring as well. 


[Remark: It is easy to see that the converse holds as well (i.e., if $G \times H$ has 
a proper k-coloring, then so do G and H), provided that the vertex sets V(G) 
and V(H) are both nonempty.] 


Exercise 6.5. Let $n \in \mathbb{N}$. Let G be the n-th coprimality graph $\operatorname{Cop}_n$ defined 
in an earlier example. Let $k \in \mathbb{N}$. Let m be the number of prime numbers in the 
set $\{1, 2, \ldots, n\}$. Prove the following: 


(a) The graph G has a proper k-coloring if and only if $k \ge m + 1$. 
(b) The graph G has a subgraph isomorphic to $K_k$ if and only if $k \le m + 1$. 


Exercise 6.6. Let n and k be two positive integers. Let K be a set of size k. 
Let D be the de Bruijn digraph, i.e., the multidigraph constructed in the 
proof of Theorem 5.16.2. Let G be the result of removing all loops from the 
undirected graph $D^{\operatorname{und}}$. Prove that G has a proper $(k + 1)$-coloring. 


Exercise 6.7. Let $k \in \mathbb{N}$. Let G be a simple graph with fewer than $\frac{k(k+1)}{2}$ 
edges. Prove that G has a proper k-coloring. 


6.5. The chromatic polynomial 


Here is another surprise: The number of proper k-colorings of a given multi- 
graph G turns out to be a polynomial function in k (with integer coefficients). 
More precisely: 

Theorem 6.5.1 (Whitney’s chromatic polynomial theorem). Let $G = (V, E, \varphi)$ 
be a multigraph. Let $\chi_G$ be the polynomial in the single indeterminate x with 
coefficients in $\mathbb{Z}$ defined as follows: 

\[
\chi_G = \sum_{F \subseteq E} (-1)^{|F|} x^{\operatorname{conn}(V, F, \varphi|_F)}
= \sum_{H \text{ is a spanning subgraph of } G} (-1)^{|E(H)|} x^{\operatorname{conn} H} .
\]

(The symbol “$\sum_{F \subseteq E}$” means “sum over all subsets F of E”.) 
Then, for any $k \in \mathbb{N}$, we have 

\[ (\text{\# of proper } k\text{-colorings of } G) = \chi_G(k) . \]
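
To make the statement of Theorem 6.5.1 concrete, one can evaluate the subset sum at an integer k and compare it with a brute-force count of proper k-colorings. The sketch below does this for one hypothetical multigraph; the encoding of edges as pairs of endpoints and the names `conn`, `chi_whitney` and `count_proper` are assumptions for illustration only.

```python
from itertools import combinations, product


def conn(vertices, edges):
    """Number of components of the graph with the given vertices and edges."""
    parent = {v: v for v in vertices}

    def find(v):
        while parent[v] != v:
            v = parent[v]
        return v

    for u, w in edges:
        parent[find(u)] = find(w)
    return len({find(v) for v in vertices})


def chi_whitney(vertices, edges, k):
    """Evaluate the sum over all F subseteq E of (-1)^|F| * k^conn(V, F)."""
    total = 0
    m = len(edges)
    for r in range(m + 1):
        # Choose edges by index so that parallel edges stay distinct.
        for idx in combinations(range(m), r):
            F = [edges[i] for i in idx]
            total += (-1) ** r * k ** conn(vertices, F)
    return total


def count_proper(vertices, edges, k):
    """Count proper k-colorings by trying every map V -> {1..k}."""
    total = 0
    for colors in product(range(1, k + 1), repeat=len(vertices)):
        f = dict(zip(vertices, colors))
        if all(f[u] != f[w] for u, w in edges):
            total += 1
    return total


# Hypothetical example: a triangle with one doubled edge and a pendant vertex.
V = [1, 2, 3, 4]
E = [(1, 2), (1, 2), (2, 3), (3, 1), (3, 4)]
values = [(chi_whitney(V, E, k), count_proper(V, E, k)) for k in range(5)]
```

For this particular graph, a direct count gives $k (k-1)^2 (k-2)$ proper k-colorings, and both functions should agree with that value for each k tested.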


The proper place for this theorem is probably a course on enumerative com- 
binatorics, but let us give here a proof for the sake of completeness (optional 
material). The following proof is essentially due to Hassler Whitney in 1930 
([Whitne32, §6]), and I am mostly copying it from my own writeup 
(§0.5 there), with some changes stemming from the fact that we are here working with 
multigraphs rather than simple graphs. 

We are going to use the Iverson bracket notation: 


Definition 6.5.2. If A is any logical statement, then [A] shall denote the truth 
value of A; this is the number 

\[
[A] = \begin{cases} 1, & \text{if } A \text{ is true;} \\ 0, & \text{if } A \text{ is false.} \end{cases}
\]

For instance, $[2 + 2 = 4] = 1$ and $[2 + 2 = 5] = 0$. 


We next recall a combinatorial identity ([17s, Lemma 3.3.5]): 


Lemma 6.5.3. Let P be a finite set. Then, 

\[ \sum_{A \subseteq P} (-1)^{|A|} = [P = \varnothing] . \]

(The symbol “$\sum_{A \subseteq P}$” means “sum over all subsets A of P”.) 
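
As a quick illustration of Lemma 6.5.3, here is the case $P = \{1, 2\}$ written out, followed by one standard way to see the general identity (grouping subsets by size and using the binomial theorem). This argument is a sketch added for orientation; it is not necessarily the proof given in the cited reference.

```latex
% Case P = {1, 2}: the subsets are \varnothing, {1}, {2}, {1,2}.
\[
\sum_{A \subseteq \{1,2\}} (-1)^{|A|}
= (-1)^0 + (-1)^1 + (-1)^1 + (-1)^2
= 1 - 1 - 1 + 1 = 0 = [\{1,2\} = \varnothing] .
\]
% General case, with n = |P|: there are \binom{n}{i} subsets of size i.
\[
\sum_{A \subseteq P} (-1)^{|A|}
= \sum_{i=0}^{n} \binom{n}{i} (-1)^i
= (1 - 1)^n = 0^n = [n = 0] = [P = \varnothing] .
\]
```

Here $0^0$ is understood to be 1, which matches the single subset $A = \varnothing$ when $P = \varnothing$.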


Next, we introduce a specific notation related to colorings: 


Definition 6.5.4. Let $G = (V, E, \varphi)$ be a multigraph. Let $k \in \mathbb{N}$. Let $f : V \to 
\{1, 2, \ldots, k\}$ be a k-coloring. We then define a subset $E_f$ of E by 

\[ E_f := \{ e \in E \mid \text{the two endpoints of } e \text{ have the same color in } f \} . \]

(Recall that the “color in f” of a vertex v means the value f(v). If an edge 
$e \in E$ is a loop, then e always belongs to $E_f$, since we think of the two 
endpoints of e as being equal.) 
The elements of $E_f$ are called the f-monochromatic edges of G. 
(“Monochromatic” means “one-colored”, so no surprises here.) 


Example 6.5.5. Let $G = (V, E, \varphi)$ be the following multigraph: 

[Figure: the multigraph G; not reproduced here.]

Let $f : V \to \{1, 2\}$ be the 2-coloring of G that sends each odd vertex to 1 and 
each even vertex to 2. (Here, an “odd vertex” means a vertex that is odd as 
an integer. Thus, the odd vertices are 1, 3, 5. “Even vertices” are understood 
similarly.) Then, $E_f = \{a, b\}$. 


Notice the following simple fact: 

Proposition 6.5.6. Let $G = (V, E, \varphi)$ be a multigraph. Let $k \in \mathbb{N}$. Let $f : V \to 
\{1, 2, \ldots, k\}$ be a k-coloring. Then, the k-coloring f is proper if and only if 
$E_f = \varnothing$. 


Proof of Proposition 6.5.6. We have the following chain of equivalences: 

(the k-coloring f is proper) 
$\Longleftrightarrow$ (no two adjacent vertices have the same color) 
    (by the definition of “proper”) 
$\Longleftrightarrow$ (there is no edge $e \in E$ such that the two endpoints of e have the same color) 
    (since adjacent vertices are vertices that are the two endpoints of an edge) 
$\Longleftrightarrow$ (there exists no element of $E_f$) 
    (since the elements of $E_f$ are precisely the edges $e \in E$ such that the two 
    endpoints of e have the same color, by the definition of $E_f$) 
$\Longleftrightarrow$ ($E_f = \varnothing$). 

This proves Proposition 6.5.6. $\square$
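
Definition 6.5.4 and Proposition 6.5.6 translate directly into code. The sketch below uses a hypothetical multigraph (not the one from Example 6.5.5, whose picture is not available here), stored as a dictionary from edge names to pairs of endpoints; the names `monochromatic_edges` and `is_proper` are assumptions.

```python
def monochromatic_edges(phi, f):
    """Return E_f: the names of all edges whose two endpoints get the same color.

    `phi` maps each edge name to its pair of endpoints (u, w); a loop has u == w
    and is therefore always monochromatic.
    """
    return {e for e, (u, w) in phi.items() if f[u] == f[w]}


def is_proper(phi, f):
    """A coloring is proper exactly when E_f is empty (Proposition 6.5.6)."""
    return not monochromatic_edges(phi, f)


# Hypothetical multigraph on the vertices 1..5 with four named edges.
phi = {"a": (1, 3), "b": (2, 4), "c": (1, 2), "d": (3, 4)}

# Color odd vertices with 1 and even vertices with 2.
f = {v: 1 if v % 2 == 1 else 2 for v in range(1, 6)}

# A second coloring, chosen so that no edge is monochromatic.
g = {1: 1, 2: 2, 3: 2, 4: 1, 5: 1}
```

With the parity coloring `f`, the edges `a` and `b` join vertices of equal parity and are monochromatic, while `c` and `d` are not.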




Lemma 6.5.7. Let $G = (V, E, \varphi)$ be a multigraph. Let B be a subset of E. Let 
$k \in \mathbb{N}$. Then, the number of all k-colorings $f : V \to \{1, 2, \ldots, k\}$ satisfying 
$B \subseteq E_f$ is $k^{\operatorname{conn}(V, B, \varphi|_B)}$. 


Proof of Lemma 6.5.7. If C is a nonempty subset of V, and if $f : V \to \{1, 2, \ldots, k\}$ 
is any k-coloring of G, then we shall say that f is constant on C if the restriction 
$f|_C$ is a constant map (i.e., if the colors f(c) for all $c \in C$ are equal). We shall 
show the following claim: 


Claim 1: Let $f : V \to \{1, 2, \ldots, k\}$ be any k-coloring of G. Then, we 
have $B \subseteq E_f$ if and only if f is constant on each component of the 
multigraph $(V, B, \varphi|_B)$. 


[Proof of Claim 1: This is an “if and only if” statement; we shall prove its 
“$\Longrightarrow$” and “$\Longleftarrow$” directions separately: 

$\Longrightarrow$: Assume that $B \subseteq E_f$. We must prove that f is constant on each 
component of the multigraph $(V, B, \varphi|_B)$. 

Let C be a component of $(V, B, \varphi|_B)$. We must prove that f is constant on C. 
In other words, we must prove that f(c) = f(d) for any $c, d \in C$. 

So let us fix $c, d \in C$. Then, the vertices c and d belong to the same component 
of the graph $(V, B, \varphi|_B)$ (namely, to C). Hence, these vertices c and d are 
path-connected in this graph. In other words, the graph $(V, B, \varphi|_B)$ has a path from 
c to d. Let 

\[ p = (v_0, e_1, v_1, e_2, v_2, \ldots, e_s, v_s) \]

be this path. Hence, $v_0 = c$ and $v_s = d$ and $e_1, e_2, \ldots, e_s \in B$. 

Let $i \in \{1, 2, \ldots, s\}$. Then, the endpoints of the edge $e_i$ are $v_{i-1}$ and $v_i$ (since 
$e_i$ is surrounded by $v_{i-1}$ and $v_i$ on the path p). However, from $e_1, e_2, \ldots, e_s \in B$, 
we obtain $e_i \in B \subseteq E_f$. Hence, the two endpoints of $e_i$ have the same color in f 
(by the definition of $E_f$). In other words, $f(v_{i-1}) = f(v_i)$ (since the endpoints 
of the edge $e_i$ are $v_{i-1}$ and $v_i$). 

Forget that we fixed i. We thus have proved the equality $f(v_{i-1}) = f(v_i)$ for 
each $i \in \{1, 2, \ldots, s\}$. Combining these equalities, we obtain 

\[ f(v_0) = f(v_1) = \cdots = f(v_s) . \]

Hence, $f(v_0) = f(v_s)$. In other words, f(c) = f(d) (since $v_0 = c$ and $v_s = d$). 
Forget that we fixed c and d. We thus have shown that f(c) = f(d) for 
any $c, d \in C$. In other words, f is constant on C. Since C was allowed to be 
an arbitrary component of $(V, B, \varphi|_B)$, we thus conclude that f is constant on 
each component of the multigraph $(V, B, \varphi|_B)$. This proves the “$\Longrightarrow$” direction 
of Claim 1. 

$\Longleftarrow$: Assume that f is constant on each component of the multigraph $(V, B, \varphi|_B)$. 
We must prove that $B \subseteq E_f$. 

Indeed, let $e \in B$. Let u and v be the two endpoints of e. Then, (u, e, v) is a 
walk from u to v in the multigraph $(V, B, \varphi|_B)$ (since $e \in B$). Hence, u is 
path-connected to v in this multigraph. In other words, u and v belong to the same 
component of the multigraph $(V, B, \varphi|_B)$. Therefore, f(u) = f(v) (since f is 
constant on each component of the multigraph $(V, B, \varphi|_B)$). This means that 
the two endpoints of e have the same color in f (since u and v are the endpoints 
of e). Combining this with the fact that $e \in E$ (because $e \in B \subseteq E$), we conclude 
that $e \in E_f$ (by the definition of $E_f$). 

Forget that we fixed e. We thus have shown that $e \in E_f$ for each $e \in B$. In 
other words, $B \subseteq E_f$. This proves the “$\Longleftarrow$” direction of Claim 1. The proof of 
Claim 1 is now complete.] 


Now, Claim 1 shows that the k-colorings $f : V \to \{1, 2, \ldots, k\}$ satisfying 
$B \subseteq E_f$ are precisely the k-colorings $f : V \to \{1, 2, \ldots, k\}$ that are constant on 
each component of the graph $(V, B, \varphi|_B)$. Hence, all such k-colorings f can be 
obtained by the following procedure: 


• For each component C of the graph $(V, B, \varphi|_B)$, pick a color $c_C$ (that is, 
an element $c_C$ of $\{1, 2, \ldots, k\}$) and then assign this color $c_C$ to each vertex 
in C (that is, set $f(v) = c_C$ for each $v \in C$). 


This procedure involves choices (because for each component C of $(V, B, \varphi|_B)$, 
we get to pick a color): Namely, for each of the $\operatorname{conn}(V, B, \varphi|_B)$ many 
components of the graph $(V, B, \varphi|_B)$, we must choose a color from the set $\{1, 2, \ldots, k\}$. 
Thus, we have a total of $k^{\operatorname{conn}(V, B, \varphi|_B)}$ many options (since we are choosing 
among k colors for each of the $\operatorname{conn}(V, B, \varphi|_B)$ components). Each of these 
options gives rise to a different k-coloring $f : V \to \{1, 2, \ldots, k\}$. Therefore, the 
number of all k-colorings $f : V \to \{1, 2, \ldots, k\}$ satisfying $B \subseteq E_f$ is $k^{\operatorname{conn}(V, B, \varphi|_B)}$ 
(because all of these k-colorings can be obtained by this procedure). This proves 
Lemma 6.5.7. $\square$
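
Lemma 6.5.7 is easy to test numerically: count all k-colorings in which every edge of B is monochromatic, and compare with k raised to the number of components of (V, B). The example below is hypothetical, and the helper names `conn` and `count_colorings_with_B_monochromatic` are assumptions made for this sketch.

```python
from itertools import product


def conn(vertices, edges):
    """Number of components of the graph with the given vertices and edges."""
    parent = {v: v for v in vertices}

    def find(v):
        while parent[v] != v:
            v = parent[v]
        return v

    for u, w in edges:
        parent[find(u)] = find(w)
    return len({find(v) for v in vertices})


def count_colorings_with_B_monochromatic(vertices, B, k):
    """Count k-colorings f (proper or not) such that each edge in B is f-monochromatic."""
    total = 0
    for colors in product(range(1, k + 1), repeat=len(vertices)):
        f = dict(zip(vertices, colors))
        if all(f[u] == f[w] for u, w in B):
            total += 1
    return total


# Hypothetical example: B consists of a path 1-2-3 and a loop at 4.
V = [1, 2, 3, 4, 5]
B = [(1, 2), (2, 3), (4, 4)]
k = 3
count = count_colorings_with_B_monochromatic(V, B, k)
```

Here (V, B) has the three components {1, 2, 3}, {4} and {5}, so the lemma predicts $3^3 = 27$ such colorings.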


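Corollary 6.5.8. Let $(V, E, \varphi)$ be a multigraph. Let F be a subset of E. Let 
$k \in \mathbb{N}$. Then, 

\[
k^{\operatorname{conn}(V, F, \varphi|_F)} = \sum_{\substack{f : V \to \{1, 2, \ldots, k\}; \\ F \subseteq E_f}} 1 .
\]


Proof of Corollary 6.5.8. We have 

\begin{align*}
\sum_{\substack{f : V \to \{1, 2, \ldots, k\}; \\ F \subseteq E_f}} 1
&= (\text{the number of all } f : V \to \{1, 2, \ldots, k\} \text{ satisfying } F \subseteq E_f) \cdot 1 \\
&= (\text{the number of all } f : V \to \{1, 2, \ldots, k\} \text{ satisfying } F \subseteq E_f) \\
&= k^{\operatorname{conn}(V, F, \varphi|_F)}
\end{align*}

(because Lemma 6.5.7 (applied to B = F) shows that the number of all 
k-colorings $f : V \to \{1, 2, \ldots, k\}$ satisfying $F \subseteq E_f$ is $k^{\operatorname{conn}(V, F, \varphi|_F)}$). This proves 
Corollary 6.5.8. $\square$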


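Proof of Theorem 6.5.1. First of all, the equality 

\[
\sum_{F \subseteq E} (-1)^{|F|} x^{\operatorname{conn}(V, F, \varphi|_F)}
= \sum_{H \text{ is a spanning subgraph of } G} (-1)^{|E(H)|} x^{\operatorname{conn} H}
\]

is clear, because the spanning subgraphs of G are precisely the subgraphs of 
the form $(V, F, \varphi|_F)$ for some $F \subseteq E$. 
Now, let $k \in \mathbb{N}$. We must prove that (# of proper k-colorings of G) = $\chi_G(k)$. 
Let us substitute k for x in the equality 

\[ \chi_G = \sum_{F \subseteq E} (-1)^{|F|} x^{\operatorname{conn}(V, F, \varphi|_F)} . \]

We thus obtain 

\begin{align*}
\chi_G(k)
&= \sum_{F \subseteq E} (-1)^{|F|} k^{\operatorname{conn}(V, F, \varphi|_F)}
 = \sum_{F \subseteq E} (-1)^{|F|} \sum_{\substack{f : V \to \{1, 2, \ldots, k\}; \\ F \subseteq E_f}} 1
 && \text{(by Corollary 6.5.8)} \\
&= \sum_{F \subseteq E} \ \sum_{\substack{f : V \to \{1, 2, \ldots, k\}; \\ F \subseteq E_f}} (-1)^{|F|}
 = \sum_{f : V \to \{1, 2, \ldots, k\}} \ \sum_{\substack{F \subseteq E; \\ F \subseteq E_f}} (-1)^{|F|} \\
&= \sum_{f : V \to \{1, 2, \ldots, k\}} \ \sum_{F \subseteq E_f} (-1)^{|F|}
 && \text{(since } E_f \subseteq E \text{)} \\
&= \sum_{f : V \to \{1, 2, \ldots, k\}} \ \sum_{A \subseteq E_f} (-1)^{|A|}
 && \text{(here, we have renamed the summation index } F \text{ in the inner sum as } A \text{)} \\
&= \sum_{f : V \to \{1, 2, \ldots, k\}} [E_f = \varnothing]
 && \text{(by Lemma 6.5.3, applied to } P = E_f \text{)} \\
&= \sum_{\substack{f : V \to \{1, 2, \ldots, k\}; \\ E_f = \varnothing}} \underbrace{[E_f = \varnothing]}_{= 1}
 + \sum_{\substack{f : V \to \{1, 2, \ldots, k\}; \\ \text{not } E_f = \varnothing}} \underbrace{[E_f = \varnothing]}_{= 0}
 && \text{(since each } f \text{ either satisfies } E_f = \varnothing \text{ or does not)} \\
&= \sum_{\substack{f : V \to \{1, 2, \ldots, k\}; \\ E_f = \varnothing}} 1
 + \underbrace{\sum_{\substack{f : V \to \{1, 2, \ldots, k\}; \\ \text{not } E_f = \varnothing}} 0}_{= 0}
 = \sum_{\substack{f : V \to \{1, 2, \ldots, k\}; \\ E_f = \varnothing}} 1 \\
&= (\text{the number of all } f : V \to \{1, 2, \ldots, k\} \text{ such that } E_f = \varnothing) \cdot 1 \\
&= (\text{the number of all } f : V \to \{1, 2, \ldots, k\} \text{ such that } E_f = \varnothing) \\
&= (\text{the number of all } f : V \to \{1, 2, \ldots, k\} \text{ such that the } k\text{-coloring } f \text{ is proper}) \\
&\qquad \text{(since Proposition 6.5.6 shows that the condition “} E_f = \varnothing \text{” is equivalent to “the } k\text{-coloring } f \text{ is proper”)} \\
&= (\text{the number of all proper } k\text{-colorings}) .
\end{align*}

In other words, the number of proper k-colorings of G is $\chi_G(k)$. This completes 
the proof of Theorem 6.5.1. $\square$

Definition 6.5.9. The polynomial $\chi_G$ in Theorem 6.5.1 is known as the 
chromatic polynomial of G. 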


Here are the chromatic polynomials of some graphs: 


Proposition 6.5.10. Let $n \ge 1$ be an integer. 


(a) For the path graph $P_n$ with n vertices, we have 

\[ \chi_{P_n} = x (x - 1)^{n-1} . \]


(b) More generally, for any tree T with n vertices, we have 

\[ \chi_T = x (x - 1)^{n-1} . \]


(c) For the complete graph $K_n$ with n vertices, we have 

\[ \chi_{K_n} = x (x - 1) (x - 2) \cdots (x - n + 1) . \]


(d) For the empty graph $E_n$ with n vertices, we have 

\[ \chi_{E_n} = x^n . \]


(e) Assume that $n \ge 2$. For the cycle graph $C_n$ with n vertices, we have 

\[ \chi_{C_n} = (x - 1)^n + (-1)^n (x - 1) . \]
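
The five formulas of Proposition 6.5.10 can be sanity-checked by counting proper k-colorings directly for small n and k. The sketch below assumes the vertices are 1, 2, ..., n and that $C_2$ consists of two parallel edges between 1 and 2; the helper names are hypothetical, and a star is used as one sample tree for part (b).

```python
from itertools import product
from math import prod


def count_proper(n, edges, k):
    """Count proper k-colorings of a graph on the vertices 1..n by brute force."""
    total = 0
    for c in product(range(k), repeat=n):
        if all(c[u - 1] != c[w - 1] for u, w in edges):
            total += 1
    return total


def path_edges(n):
    return [(i, i + 1) for i in range(1, n)]


def cycle_edges(n):
    # For n = 2, this yields two parallel edges between 1 and 2.
    return path_edges(n) + [(n, 1)]


def complete_edges(n):
    return [(i, j) for i in range(1, n + 1) for j in range(i + 1, n + 1)]


def star_edges(n):
    # A sample tree with n vertices: vertex 1 joined to all others.
    return [(1, i) for i in range(2, n + 1)]


def formulas_hold(n, k):
    """Compare brute-force counts with the closed formulas, for n >= 2."""
    return (
        count_proper(n, path_edges(n), k) == k * (k - 1) ** (n - 1)
        and count_proper(n, star_edges(n), k) == k * (k - 1) ** (n - 1)
        and count_proper(n, complete_edges(n), k) == prod(k - i for i in range(n))
        and count_proper(n, [], k) == k ** n
        and count_proper(n, cycle_edges(n), k) == (k - 1) ** n + (-1) ** n * (k - 1)
    )


all_ok = all(formulas_hold(n, k) for n in range(2, 6) for k in range(5))
```

Such a check only tests finitely many values, of course; the proofs below are what establish the polynomial identities.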


Proof sketch. (c) In order to prove that two polynomials with real coefficients are 
identical, it suffices to show that they agree on all nonnegative integers (this is an instance 
of the “principle of permanence of polynomial identities” that we have already stated 
as Theorem [.19.4]). Thus, in order to prove that $\chi_{K_n} = x (x - 1) (x - 2) \cdots (x - n + 1)$, 
it suffices to show that $\chi_{K_n}(k) = k (k - 1) (k - 2) \cdots (k - n + 1)$ for each $k \in \mathbb{N}$. 

So let us do this. Fix $k \in \mathbb{N}$. Theorem 6.5.1 (applied to $G = K_n$) yields 

\[ (\text{\# of proper } k\text{-colorings of } K_n) = \chi_{K_n}(k) . \tag{38} \]

Now, how many proper k-colorings does $K_n$ have? We can construct such a proper 
k-coloring as follows: 


• First, choose the color of the vertex 1. There are k options for this. 


• Then, choose the color of the vertex 2. There are k − 1 options for this, since it 
must differ from the color of 1. 


• Then, choose the color of the vertex 3. There are k − 2 options for this, since it 
must differ from the colors of 1 and of 2 (and the latter two colors are distinct, 
so we must subtract 2, not 1). 

• And so on, until all n vertices are colored. 


The total number of options to perform this construction is therefore 
$k (k - 1) (k - 2) \cdots (k - n + 1)$. Hence, 

\[ (\text{\# of proper } k\text{-colorings of } K_n) = k (k - 1) (k - 2) \cdots (k - n + 1) . \]

Comparing this with (38), we obtain $\chi_{K_n}(k) = k (k - 1) (k - 2) \cdots (k - n + 1)$. As we 
already explained, this completes the proof of Proposition 6.5.10 (c). 


(d) This is similar to part (c), but easier. We leave the proof to the reader. 
Alternatively, it follows easily from the definition of $\chi_{E_n}$, since $E_n$ has only one spanning 
subgraph (namely, $E_n$ itself). 


(b) (This is an outline; see §0.6 of the writeup mentioned above for details.) 

We proceed by induction on n. If n = 1, then this is easily checked by hand. If 
n > 1, then the tree T has at least one leaf (by Theorem 5.3.2 (a)). Thus, we can fix a 
leaf $\ell$ of T. The graph $T \setminus \ell$ then is a tree (by Theorem 5.3.3) and has n − 1 vertices, and 
therefore (by the induction hypothesis) its chromatic polynomial is $\chi_{T \setminus \ell} = x (x - 1)^{n-2}$. 
However, for any given $k \in \mathbb{N}$, we can construct a proper k-coloring of T by first 
choosing a proper k-coloring of $T \setminus \ell$ and then choosing the color of the remaining 
leaf $\ell$ (there are k − 1 choices for it, since it has to differ from the color of the unique 
neighbor of $\ell$). Therefore, for each $k \in \mathbb{N}$, we have 

\[ (\text{\# of proper } k\text{-colorings of } T) = (\text{\# of proper } k\text{-colorings of } T \setminus \ell) \cdot (k - 1) . \]

In view of Theorem 6.5.1, this equality can be rewritten as 

\[ \chi_T(k) = \chi_{T \setminus \ell}(k) \cdot (k - 1) . \]

Since this holds for all $k \in \mathbb{N}$, we thus conclude that 

\[ \chi_T = \chi_{T \setminus \ell} \cdot (x - 1) = x (x - 1)^{n-2} \cdot (x - 1) = x (x - 1)^{n-1} . \]

This completes the induction step. 

Alternatively, Proposition 6.5.10 (b) can also be derived from the definition of $\chi_T$, 
using the fact that every spanning subgraph H of T has no cycles and therefore satisfies 
$\operatorname{conn} H = n - |E(H)|$ (by Corollary 5.1.7). 


(a) This is a particular case of part (b), since $P_n$ is a tree with n vertices. 


(e) There are different ways to prove this; see [LeeShi19] for four different proofs. 
The simplest one is probably by induction on n: Let n > 2. Fix $k \in \mathbb{N}$. A proper 
k-coloring of $C_n$ is the same as a proper k-coloring of $P_n$ that assigns different colors to 
the vertices 1 and n. Hence, 

\begin{align*}
&(\text{\# of proper } k\text{-colorings of } C_n) \\
&= (\text{\# of proper } k\text{-colorings of } P_n \text{ that assign different colors to } 1 \text{ and } n) \\
&= \underbrace{(\text{\# of proper } k\text{-colorings of } P_n)}_{= k (k - 1)^{n-1} \text{ (by part (a))}}
 - \underbrace{(\text{\# of proper } k\text{-colorings of } P_n \text{ that assign the same color to } 1 \text{ and } n)}_{= (\text{\# of proper } k\text{-colorings of } C_{n-1}) \text{ (why?)}} \\
&= k (k - 1)^{n-1} - (\text{\# of proper } k\text{-colorings of } C_{n-1}) .
\end{align*}

In view of Theorem 6.5.1, this equality can be rewritten as 

\[ \chi_{C_n}(k) = k (k - 1)^{n-1} - \chi_{C_{n-1}}(k) . \]

Since this holds for all $k \in \mathbb{N}$, we thus obtain 

\[ \chi_{C_n} = x (x - 1)^{n-1} - \chi_{C_{n-1}} . \]

This is a recursion that is easily solved for $\chi_{C_n}$, yielding the claim of part (e). 
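
For readers who want to see the recursion solved explicitly, here is one possible way, written as a sketch. It assumes the base case n = 2 is computed directly, with $C_2$ consisting of two parallel edges, so that $\chi_{C_2} = x (x - 1)$; the auxiliary polynomial $D_n$ is introduced only for this computation.

```latex
% Set D_n := \chi_{C_n} - (x-1)^n. The recursion then gives, for n > 2:
\begin{align*}
D_n
&= x (x-1)^{n-1} - \chi_{C_{n-1}} - (x-1)^n \\
&= (x-1)^{n-1} \bigl( x - (x-1) \bigr) - \chi_{C_{n-1}} \\
&= (x-1)^{n-1} - \chi_{C_{n-1}}
 = - D_{n-1} .
\end{align*}
% Base case: D_2 = x(x-1) - (x-1)^2 = x - 1. Hence, by induction on n:
\[
D_n = (-1)^{n-2} (x - 1) = (-1)^n (x - 1),
\qquad \text{so that} \qquad
\chi_{C_n} = (x-1)^n + (-1)^n (x - 1) .
\]
```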


(Proposition 6.5.10 (e) also appeared as Exercise 2 (a) on midterm #3 in my Spring 
2017 course; see the course website for solutions.) $\square$


Exercise 6.8. Let $g \in \mathbb{N}$. Let G be the simple graph whose vertices are the 
2g + 1 integers $-g, -g + 1, \ldots, g - 1, g$, and whose edges are 

{0, i} for all $i \in \{1, 2, \ldots, g\}$; 

{0, −i} for all $i \in \{1, 2, \ldots, g\}$; 

{i, −i} for all $i \in \{1, 2, \ldots, g\}$ 

(these are 3g edges in total). 


Compute the chromatic polynomial $\chi_G$ of G. 
[Here is what G looks like in the case when g = 4: (figure not reproduced here)] 

[Solution: This is Exercise 2 (b) on midterm #3 from my Spring 2017 
course; see the course page for solutions.] 


6.6. Vizing’s theorem 


So far we have been coloring the vertices of a graph. We can also color the 
edges: 


Definition 6.6.1. Let $G = (V, E, \varphi)$ be a multigraph. Let $k \in \mathbb{N}$. 

A k-edge-coloring of G means a map $f : E \to \{1, 2, \ldots, k\}$. 

Such a k-edge-coloring f is called proper if no two distinct edges that have 
a common endpoint have the same color. 


The most prominent fact about edge-colorings is the following theorem: 


Theorem 6.6.2 (Vizing’s theorem). Let G be a simple graph with at least one 
vertex. Let 

\[ \alpha := \max \{ \deg v \mid v \in V \} . \]

Then, G has a proper $(\alpha + 1)$-edge-coloring. 


Proof. See, e.g., [Schrij04] or various textbooks on graph theory. $\square$


Two remarks: 


• The $\alpha + 1$ in Vizing’s theorem cannot be improved in general (e.g., take G 
to be an odd-length cycle graph $C_n$). 


• Vizing’s theorem can be adapted to work for multigraphs instead of simple 
graphs. However, this requires replacing the $\alpha + 1$ by $\alpha + m$, where 
m is the maximum number of distinct mutually parallel edges in G (since 
otherwise, the multigraph $(K_3^{\operatorname{bidir}})^{\operatorname{und}}$ would be a counterexample, as it 
has $\alpha = 4$ but has no proper 5-edge-coloring). For a proof of this, see 
[BerFou91, Corollary 2]. 
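
Both remarks can be confirmed by exhaustive search on the two small graphs they mention. The following sketch assumes loopless graphs given as edge lists; the function name `has_proper_edge_coloring` and the variable names are hypothetical, and the doubled triangle is written out by hand as a stand-in for the multigraph named above.

```python
from itertools import product


def has_proper_edge_coloring(edges, k):
    """Decide by brute force whether a loopless multigraph has a proper k-edge-coloring."""
    m = len(edges)
    # Pairs of distinct edges (by index) that share at least one endpoint.
    conflicts = [
        (i, j)
        for i in range(m)
        for j in range(i + 1, m)
        if set(edges[i]) & set(edges[j])
    ]
    for colors in product(range(k), repeat=m):
        if all(colors[i] != colors[j] for i, j in conflicts):
            return True
    return False


# The 3-cycle: maximum degree 2, yet 2 colors do not suffice.
triangle = [(1, 2), (2, 3), (3, 1)]

# The triangle with every edge doubled: maximum degree 4, and any two of its
# 6 edges share an endpoint, so 6 colors are needed.
doubled_triangle = triangle + triangle
```

The search takes k to the power of the number of edges in the worst case, so it is only practical for very small graphs such as these.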


6.7. Further exercises 


Some interesting things can be said about colorings of graphs, even about non- 
proper colorings: 


(Footnote: Note that [Schrij04] uses some standard graph-theoretical notations: What we call $\alpha$ is 
denoted by $\Delta(G)$ in [Schrij04], whereas $\chi'(G)$ denotes the minimum $k \in \mathbb{N}$ for which G has 
a proper k-edge-coloring.) 

Exercise 6.9. Let G = (V, E) be a simple graph. 
Prove that there exists a 2-coloring f of G with the following property: For 
each vertex $v \in V$, at most $\frac{1}{2} \deg v$ among the neighbors of v have the same 
color as v. 


[Remark: This problem is often restated as follows: You are given a (finite) 
set of politicians; some politicians are mutual enemies. (No politician is his 
own enemy. If u is an enemy of v, then v is an enemy of u. An enemy of 
an enemy is not necessarily a friend. So this is just a simple graph.) Prove 
that it is possible to subdivide this set into two (disjoint) parties such that no 
politician has more than half of his enemies in his own party.] 


[Hint: First, pick an arbitrary 2-coloring f of G. Then, gradually improve 
it until it satisfies the required property.] 


[Solution: This is Exercise 1 on homework set #0 from my Spring 2017 
course; see the course page for solutions.] 


Exercise 6.9 can be generalized to multiple colors: 


Exercise 6.10. Let $k \in \mathbb{N}$. Let $p_1, p_2, \ldots, p_k$ be k nonnegative real numbers 
such that $p_1 + p_2 + \cdots + p_k \ge 1$. 

Let G = (V, E) be a simple graph. 

Prove that there exists a k-coloring f of G with the following property: For 
each vertex $v \in V$, at most $p_{f(v)} \deg v$ neighbors of v have the same color as 
v. 


[Solution: This is Exercise 5 on midterm #1 from my Spring 2017 course; 
see the course page for solutions.] 


7. Independent sets 


7.1. Definition and the Caro—Wei theorem 


Next, we define one of the most fundamental notions in graph theory: 


Definition 7.1.1. An independent set of a multigraph G means a subset S of 
V (G) such that no two elements of S are adjacent. 


In other words, an independent set of G means an induced subgraph of G 
that has no edges. Note that “no two elements of S” doesn’t mean “no two 
distinct elements of S”. 


(Footnote: This is a somewhat sloppy statement. Of course, an independent set is not literally an 
induced subgraph, since the former is just a set, while the latter is a graph. What I mean is 
that a subset S of V(G) is independent if and only if the induced subgraph G[S] has no 
edges.) 

Thus, for example, what we called an “anti-triangle” (back in Definition 2.3.2) 
is an independent set of size 3. 


Remark 7.1.2. Independent sets are closely related to proper colorings. 
Indeed, let G be a graph, and let $k \in \mathbb{N}$. Let $f : V \to \{1, 2, \ldots, k\}$ be a 
k-coloring. For each $i \in \{1, 2, \ldots, k\}$, let 

\begin{align*}
V_i &:= \{ v \in V \mid f(v) = i \} \\
&= \{ \text{all vertices of } G \text{ that have color } i \} .
\end{align*}

Then, the k-coloring f is proper if and only if the k sets $V_1, V_2, \ldots, V_k$ are 
independent sets of G. (Proving this is a matter of unraveling the definitions 
of “independent sets” and “proper k-colorings”.) 
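
The equivalence in Remark 7.1.2 can be checked exhaustively on a small graph. In the sketch below, the example graph (a 4-cycle) and the helper names `is_independent`, `is_proper`, `color_classes` and `remark_holds` are assumptions for illustration.

```python
from itertools import product


def is_independent(S, edges):
    """S is independent if no edge has both endpoints in S (a loop at a vertex of S counts)."""
    S = set(S)
    return not any(u in S and w in S for u, w in edges)


def is_proper(f, edges):
    return all(f[u] != f[w] for u, w in edges)


def color_classes(f, k):
    """Return the list [V_1, ..., V_k], where V_i is the set of vertices of color i."""
    return [{v for v in f if f[v] == i} for i in range(1, k + 1)]


def remark_holds(vertices, edges, k):
    """Check 'f proper <=> all color classes independent' for every k-coloring f."""
    for colors in product(range(1, k + 1), repeat=len(vertices)):
        f = dict(zip(vertices, colors))
        classes_ok = all(is_independent(Vi, edges) for Vi in color_classes(f, k))
        if is_proper(f, edges) != classes_ok:
            return False
    return True


# Hypothetical example: the 4-cycle 1-2-3-4-1, colored with up to 3 colors.
V = [1, 2, 3, 4]
E = [(1, 2), (2, 3), (3, 4), (4, 1)]
ok = remark_holds(V, E, 3)
```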


One classical computational problem in graph theory is to find a maximum- 
size independent set of a given graph. This problem is NP-hard, so don’t expect 
a quick algorithm or even a good formula for the maximum size of an indepen- 
dent set. However, there are some lower bounds for this maximum size. Here 
is one, known as the Caro-Wei theorem ([AloSpe16, Chapter 6, Probabilistic 
Lens]): 


Theorem 7.1.3 (Caro-Wei theorem). Let $G = (V, E, \varphi)$ be a loopless 
multigraph. Then, G has an independent set of size 

\[ \ge \sum_{v \in V} \frac{1}{1 + \deg v} . \]


Example 7.1.4. Let G be the following loopless multigraph: 

[Figure: the multigraph G; not reproduced here.]

Then, the degrees of the vertices of G are 3, 2, 3, 2, 2, 2. Hence, Theorem 7.1.3 
yields that G has an independent set of size 

\[
\ge \frac{1}{1+3} + \frac{1}{1+2} + \frac{1}{1+3} + \frac{1}{1+2} + \frac{1}{1+2} + \frac{1}{1+2} = \frac{11}{6} .
\]

Since the size of an independent set is always an integer, we can round this 
up and conclude that G has an independent set of size $\ge 2$. In truth, G 
actually has an independent set of size 3 (namely, {2, 4, 6}), but there is no way 
to tell this from the degrees of its vertices alone. For example, the vertices of 
the graph 

H := [figure not reproduced here] 

have the same degrees as those of G, but H has no independent set of size 3. 

We shall give two proofs of Theorem 7.1.3, both of them illustrating useful 
techniques. 


First proof of Theorem 7.1.3. Assume the contrary. Thus, each independent set S 
of G has size 

\[ |S| < \sum_{v \in V} \frac{1}{1 + \deg v} . \tag{39} \]

A V-listing shall mean a list of all vertices in V, with each vertex occurring 
exactly once in the list. If $\sigma$ is a V-listing, then we define a subset $J_\sigma$ of V as 
follows: 

\[ J_\sigma := \{ v \in V \mid v \text{ occurs before all neighbors of } v \text{ in } \sigma \} . \]

[Example: Let G be the following graph: 

[Figure: the graph G; not reproduced here.]

Let $\sigma$ be the V-listing (1, 2, 7, 5, 3, 4, 6). Then, the vertex 1 occurs before all its 
neighbors (2, 4 and 5) in $\sigma$, and thus we have $1 \in J_\sigma$. Likewise, the vertex 7 
occurs before all its neighbors (3 and 6) in $\sigma$, so that we have $7 \in J_\sigma$. But the 
vertex 2 does not occur before all its neighbors in $\sigma$ (indeed, it occurs after its 
neighbor 1), so that we have $2 \notin J_\sigma$. Likewise, the vertices 5, 3, 4, 6 don’t belong 
to $J_\sigma$. Altogether, we thus obtain $J_\sigma = \{1, 7\}$.] 

(Footnote: Note that the looplessness requirement in Theorem 7.1.3 is important: If G has a loop at each 
vertex, then the only independent set of G is $\varnothing$.) 


The set $J_\sigma$ is an independent set of G (because if two vertices u and v in $J_\sigma$ 
were adjacent, then u would have to occur before v in $\sigma$, but v would have to 
occur before u in $\sigma$; but these two statements clearly contradict each other). 
Thus, (39) (applied to $S = J_\sigma$) yields 

\[ |J_\sigma| < \sum_{v \in V} \frac{1}{1 + \deg v} . \]

This inequality holds for each V-listing $\sigma$. Thus, summing this inequality 
over all V-listings $\sigma$, we obtain 

\begin{align*}
\sum_{\sigma \text{ is a } V\text{-listing}} |J_\sigma|
&< \sum_{\sigma \text{ is a } V\text{-listing}} \ \sum_{v \in V} \frac{1}{1 + \deg v} \\
&= (\text{\# of all } V\text{-listings}) \cdot \sum_{v \in V} \frac{1}{1 + \deg v} . \tag{40}
\end{align*}


On the other hand, I claim the following: 


Claim 1: For each $v \in V$, we have 

\[
(\text{\# of all } V\text{-listings } \sigma \text{ satisfying } v \in J_\sigma) \ge \frac{(\text{\# of all } V\text{-listings})}{1 + \deg v} .
\]

[Proof of Claim 1: Fix a vertex $v \in V$. Define $\deg' v$ to be the # of all neighbors 
of v. Clearly, $\deg' v \le \deg v$. 

We shall call a V-listing $\sigma$ good if the vertex v occurs in it before all its 
neighbors. In other words, a V-listing $\sigma$ is good if and only if it satisfies $v \in J_\sigma$ 
(because $v \in J_\sigma$ means that the vertex v occurs in $\sigma$ before all its neighbors; 
this follows straight from the definition of $J_\sigma$). 
Thus, we must show that 

\[
(\text{\# of all good } V\text{-listings}) \ge \frac{(\text{\# of all } V\text{-listings})}{1 + \deg v} .
\]

We define a map 

\[ \Gamma : \{ \text{all } V\text{-listings} \} \to \{ \text{all good } V\text{-listings} \} \]

as follows: Whenever $\tau$ is a V-listing, we let $\Gamma(\tau)$ be the V-listing obtained 
from $\tau$ by swapping v with the first neighbor of v that occurs in $\tau$ (or, if $\tau$ is 
already good, then we just do nothing, i.e., we set $\Gamma(\tau) = \tau$). This map $\Gamma$ is 
a $(1 + \deg' v)$-to-1 correspondence, i.e., for each good V-listing $\sigma$, there are 
exactly $1 + \deg' v$ many V-listings $\tau$ that satisfy $\Gamma(\tau) = \sigma$ (in fact, one of these 
$\tau$’s is $\sigma$ itself, and the remaining $\deg' v$ many of these $\tau$’s are obtained from $\sigma$ 
by switching v with some neighbor of v). Hence, by the multijection principle, 
we conclude that 

\[ |\{ \text{all } V\text{-listings} \}| = (1 + \deg' v) \cdot |\{ \text{all good } V\text{-listings} \}| . \]

In other words, 

\[ (\text{\# of all } V\text{-listings}) = (1 + \deg' v) \cdot (\text{\# of all good } V\text{-listings}) . \]

Hence, 

\[
(\text{\# of all good } V\text{-listings}) = \frac{(\text{\# of all } V\text{-listings})}{1 + \deg' v} \ge \frac{(\text{\# of all } V\text{-listings})}{1 + \deg v}
\]

(since $\deg' v \le \deg v$). This proves Claim 1 (since the good V-listings are 
precisely the V-listings $\sigma$ satisfying $v \in J_\sigma$).] 


Next, we recall a basic property of the Iverson bracket notation: If T is a 
subset of a finite set S, then 

\[ |T| = \sum_{v \in S} [v \in T] . \tag{41} \]

(Indeed, the sum $\sum_{v \in S} [v \in T]$ contains an addend equal to 1 for each $v \in T$, 
and an addend equal to 0 for each $v \in S \setminus T$. Thus, this sum amounts to 
$|T| \cdot 1 + |S \setminus T| \cdot 0 = |T|$.) 

(Footnote: See a footnote in the proof of Theorem 5.10.4 for the statement of the multijection principle.) 
(Footnote: See, e.g., Definition 5.14.2 for the definition of the Iverson bracket notation.) 

Now, (40) yields 

\begin{align*}
(\text{\# of all } V\text{-listings}) \cdot \sum_{v \in V} \frac{1}{1 + \deg v}
&> \sum_{\sigma \text{ is a } V\text{-listing}} |J_\sigma|
 = \sum_{\sigma \text{ is a } V\text{-listing}} \ \sum_{v \in V} [v \in J_\sigma]
 && \text{(since (41) yields } |J_\sigma| = \sum_{v \in V} [v \in J_\sigma] \text{)} \\
&= \sum_{v \in V} \ \sum_{\sigma \text{ is a } V\text{-listing}} [v \in J_\sigma] \\
&= \sum_{v \in V} (\text{\# of all } V\text{-listings } \sigma \text{ satisfying } v \in J_\sigma) \\
&\ge \sum_{v \in V} \frac{(\text{\# of all } V\text{-listings})}{1 + \deg v}
 && \text{(by Claim 1)} \\
&= (\text{\# of all } V\text{-listings}) \cdot \sum_{v \in V} \frac{1}{1 + \deg v} .
\end{align*}

(In the third equality sign, we used the fact that the sum $\sum_{\sigma \text{ is a } V\text{-listing}} [v \in J_\sigma]$ 
contains an addend equal to 1 for each V-listing $\sigma$ satisfying $v \in J_\sigma$, 
and an addend equal to 0 for each other V-listing $\sigma$.) 

This is absurd (since no real number x can satisfy x > x). So we got a 
contradiction, and our proof of Theorem 7.1.3 is complete. $\square$


Remark 7.1.5. This proof is an example of a probabilistic proof. Why? We
have been manipulating sums, but we could easily replace these sums by
averages. Claim 1 then would say the following: For any given vertex v ∈ V,
the probability that a (uniformly random) V-listing σ satisfies v ∈ J_σ is
≥ 1/(1 + deg v). Thus, the expectation of |J_σ| is ≥ Σ_{v∈V} 1/(1 + deg v)
(by linearity of expectation). Therefore, at least one V-listing σ actually
satisfies |J_σ| ≥ Σ_{v∈V} 1/(1 + deg v). So the whole proof can be restated
in terms of probabilities and expectations.

Note that this proof (as it stands) is fairly useless when it comes to actually
finding an independent set of size ≥ Σ_{v∈V} 1/(1 + deg v). It does not give any
better algorithm than “try the subsets J_σ for all possible V-listings σ; one of
them will work”, which is even slower than trying all subsets of V.

Note also that the proof does not entail that at least half of the V-listings
σ will satisfy |J_σ| ≥ Σ_{v∈V} 1/(1 + deg v). The mean is not the median!
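The averaging argument of Remark 7.1.5 can be checked by brute force on a tiny graph. The following is only a sketch under stated assumptions: it takes J_σ to be the set of all vertices that occur in the listing σ before all of their neighbors (the definition of J_σ is given earlier in the text and is paraphrased here), it represents a simple graph as a hypothetical dictionary `adj` of neighbor sets, and the function names are made up for illustration.

```python
from fractions import Fraction
from itertools import permutations


def j_set(listing, adj):
    """Vertices that occur in `listing` before all of their neighbors (assumed definition of J_sigma)."""
    pos = {v: i for i, v in enumerate(listing)}
    return {v for v in listing if all(pos[v] < pos[w] for w in adj[v])}


def degree_sum_bound(adj):
    """The sum of 1/(1 + deg v) over all vertices v of a simple graph."""
    return sum(Fraction(1, 1 + len(adj[v])) for v in adj)


def average_j_size(adj):
    """Exact mean of |J_sigma| over all V-listings (feasible only for tiny V)."""
    listings = list(permutations(adj))
    return Fraction(sum(len(j_set(s, adj)) for s in listings), len(listings))


# Hypothetical example: the path 1 - 2 - 3 - 4.
path = {1: {2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}
print(average_j_size(path), degree_sum_bound(path))
```

For a simple graph, the mean and the bound coincide, since a vertex v is first among itself and its deg v neighbors with probability exactly 1/(1 + deg v). Enumerating all listings takes time |V|!, which illustrates why the first proof is of little algorithmic use.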


Let us now give a second proof of the theorem, which does provide a good 
algorithm: 


An introduction to graph theory, version August 2, 2023 page 302 


Second proof of Theorem 7.1.3. We proceed by strong induction on |V|. Thus, we
fix p ∈ ℕ, and we assume (as the induction hypothesis) that Theorem 7.1.3 is
already proved for all loopless multigraphs G with < p vertices. We must now
prove it for a loopless multigraph G = (V, E, φ) with p vertices.

If |V| = 0, then this is clear (since ∅ is an independent set of appropriate
size). Thus, we WLOG assume that |V| ≠ 0. We furthermore assume WLOG
that G is a simple graph (because otherwise, we can replace G by G^simp; this
can only decrease the degrees deg v of the vertices v ∈ V, and thus our claim
only becomes stronger).

Since |V| ≠ 0, there exists a vertex u ∈ V with deg_G u minimum.⁹⁸ Pick such
a u. Thus,

\[ \deg_G v \geq \deg_G u \qquad \text{for each } v \in V. \tag{42} \]

Let U := {u} ∪ {all neighbors of u}. Thus, U ⊆ V and |U| = 1 + deg_G u (this
is an honest equality, since G is a simple graph).

Let G′ be the induced subgraph of G on the set V \ U. This is the simple graph
obtained from G by removing all vertices belonging to U (that is, removing the
vertex u along with all its neighbors) and removing all edges that contain any
of these vertices. Then, G′ has fewer vertices than G. Hence, G′ has < p vertices
(since G has p vertices). Hence, by the induction hypothesis, Theorem 7.1.3 is
already proved for G′. In other words, G′ has an independent set of size

\[ \geq \sum_{v \in V \setminus U} \frac{1}{1 + \deg_{G'} v} . \]

Let T be such an independent set. Set S := {u} ∪ T. Then, S is an independent
set of G (since T ⊆ V \ U, so that T contains no neighbors of u). Moreover, I
claim that |S| ≥ Σ_{v∈V} 1/(1 + deg_G v). Indeed, this follows from

\[
\begin{aligned}
\sum_{v \in V} \frac{1}{1 + \deg_G v}
&= \sum_{v \in U} \underbrace{\frac{1}{1 + \deg_G v}}_{\substack{\leq \frac{1}{1 + \deg_G u} \\ \text{(since } \deg_G v \geq \deg_G u \text{ by (42))}}}
 + \sum_{v \in V \setminus U} \underbrace{\frac{1}{1 + \deg_G v}}_{\substack{\leq \frac{1}{1 + \deg_{G'} v} \\ \text{(since } \deg_G v \geq \deg_{G'} v \text{, because } G' \text{ is a subgraph of } G)}} \\
&\leq \underbrace{\sum_{v \in U} \frac{1}{1 + \deg_G u}}_{\substack{= |U| \cdot \frac{1}{1 + \deg_G u} = 1 \\ \text{(since } |U| = 1 + \deg_G u)}}
 + \underbrace{\sum_{v \in V \setminus U} \frac{1}{1 + \deg_{G'} v}}_{\substack{\leq |T| \\ \text{(since } T \text{ has size } \geq \sum_{v \in V \setminus U} \frac{1}{1 + \deg_{G'} v})}} \\
&\leq 1 + |T| = |S| \qquad \text{(since } S = \{u\} \cup T \text{ and } u \notin T).
\end{aligned}
\]


⁹⁸ Here, the notation deg_H u means the degree of a vertex u in a graph H.




So we have found an independent set of G having size ≥ Σ_{v∈V} 1/(1 + deg_G v)
(namely, S). This means that Theorem 7.1.3 holds for our G. This completes the
induction step, and Theorem 7.1.3 is proved. □


Remark 7.1.6. The second proof of Theorem 7.1.3 (unlike the first one) does
give a fairly efficient algorithm for finding an independent set of the
appropriate size. However, the second proof is actually not that much different
from the first proof; it can in fact be recovered from the first proof by
derandomization, specifically using the method of conditional probabilities.
(This is a general technique for “derandomizing” probabilistic proofs, i.e., 
turning them into algorithmic ones. It often requires some ingenuity and is 
not guaranteed to always work, but the above is an example where it can be 
applied. See Chapter 13] for much more about derandomization.) 


See also [Chen14] and [AloSpel16] for more about probabilistic proofs in 
combinatorics and in general. Here are two more applications of probabilis- 
tic proofs: 


Exercise 7.1. Let G = (V, E) be a simple graph such that each vertex of
G has degree ≥ 1. Prove that there exists a subset S of V having size ≥
Σ_{v∈V} 2/(1 + deg v) and with the property that the induced subgraph G[S] is a
forest.

[Hint: As the example of [figure: a small loopless multigraph] shows, this
claim is not true for loopless multigraphs (unlike the similar Theorem 7.1.3).]


Exercise 7.2. Let n be a positive integer. Prove that there exists a tournament
with n vertices and at least n!/2^{n−1} Hamiltonian paths.


7.2. A weaker (but simpler) lower bound 


Let us now weaken Theorem 7.1.3 a bit:


Corollary 7.2.1. Let G be a loopless multigraph with n vertices and m edges.
Then, G has an independent set of size

\[ \geq \frac{n^2}{n + 2m} . \]


In order to prove this, we will need the following inequality: 
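Lemma 7.2.2. Let a_1, a_2, ..., a_n be n positive reals. Then,

\[ \frac{1}{a_1} + \frac{1}{a_2} + \cdots + \frac{1}{a_n} \geq \frac{n^2}{a_1 + a_2 + \cdots + a_n} . \]


Proof of Lemma 7.2.2. There are several ways to prove this:⁹⁹

• Apply Jensen’s inequality to the convex function ℝ⁺ → ℝ⁺, x ↦ 1/x.

• Apply the Cauchy–Schwarz inequality to get

\[ (a_1 + a_2 + \cdots + a_n) \left( \frac{1}{a_1} + \frac{1}{a_2} + \cdots + \frac{1}{a_n} \right) \geq \left( \sqrt{a_1} \sqrt{\frac{1}{a_1}} + \sqrt{a_2} \sqrt{\frac{1}{a_2}} + \cdots + \sqrt{a_n} \sqrt{\frac{1}{a_n}} \right)^2 = n^2 . \]

• Apply the AM-HM inequality.

• Apply the AM-GM inequality twice, then multiply.

• There is a direct proof, too: First, recall the famous inequality

\[ \frac{u}{v} + \frac{v}{u} \geq 2 , \tag{43} \]

which holds for any two positive reals u and v. (This follows by observing
that u/v + v/u − 2 = (u − v)²/(uv) ≥ 0.) Now,

\[
\begin{aligned}
(a_1 + a_2 + \cdots + a_n) \left( \frac{1}{a_1} + \frac{1}{a_2} + \cdots + \frac{1}{a_n} \right)
&= \sum_{i=1}^n \sum_{j=1}^n \frac{a_i}{a_j}
 = \frac{1}{2} \sum_{i=1}^n \sum_{j=1}^n \frac{a_i}{a_j} + \frac{1}{2} \sum_{i=1}^n \sum_{j=1}^n \frac{a_i}{a_j} \\
&\qquad \left( \text{since } x = \tfrac{1}{2} (x + x) \text{ for any } x \in \mathbb{R} \right) \\
&= \frac{1}{2} \sum_{i=1}^n \sum_{j=1}^n \frac{a_i}{a_j} + \frac{1}{2} \sum_{j=1}^n \sum_{i=1}^n \frac{a_j}{a_i} \\
&\qquad \left( \text{here, we renamed } i \text{ and } j \text{ as } j \text{ and } i \text{ in the second double sum} \right) \\
&= \frac{1}{2} \sum_{i=1}^n \sum_{j=1}^n \frac{a_i}{a_j} + \frac{1}{2} \sum_{i=1}^n \sum_{j=1}^n \frac{a_j}{a_i} \\
&\qquad \left( \text{here, we swapped the two summation signs in the second double sum} \right) \\
&= \frac{1}{2} \sum_{i=1}^n \sum_{j=1}^n \underbrace{\left( \frac{a_i}{a_j} + \frac{a_j}{a_i} \right)}_{\geq 2 \text{ (by (43))}}
 \geq \frac{1}{2} \sum_{i=1}^n \sum_{j=1}^n 2 = \frac{1}{2} \cdot n^2 \cdot 2 = n^2 ,
\end{aligned}
\]

from which the claim of Lemma 7.2.2 follows. □

⁹⁹ For unexplained terminology used in the bullet points above, see any textbook on
inequalities, such as [Steele04]. (That said, notation is not completely standardized; what I call
“AM-HM inequality” is dubbed “HM-AM inequality” in [Steele04].)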


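Proof of Corollary 7.2.1. Write the multigraph G as G = (V, E, φ). Thus, |V| = n
and |E| = m. We WLOG assume that V = {1, 2, ..., n} (since |V| = n). Hence,

\[ \sum_{v=1}^n \deg v = \sum_{v \in V} \deg v = 2 \cdot |E| \quad \text{(by Proposition 2.4.3)} \quad = 2m . \]

However, Theorem 7.1.3 yields that G has an independent set of size

\[
\begin{aligned}
\geq \sum_{v \in V} \frac{1}{1 + \deg v}
&= \sum_{v=1}^n \frac{1}{1 + \deg v} \qquad \text{(since } V = \{1, 2, \ldots, n\}) \\
&\geq \frac{n^2}{\sum_{v=1}^n (1 + \deg v)} \qquad \left( \begin{array}{c} \text{by Lemma 7.2.2, applied to the } n \text{ positive} \\ \text{reals } a_v = 1 + \deg v \text{ for all } v \in \{1, 2, \ldots, n\} \end{array} \right) \\
&= \frac{n^2}{n + 2m} \qquad \left( \text{since } \sum_{v=1}^n (1 + \deg v) = n + \sum_{v \in V} \deg v = n + 2m \right) .
\end{aligned}
\]

This proves Corollary 7.2.1. □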
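To see how much is lost when passing from Theorem 7.1.3 to Corollary 7.2.1, one can compare the two bounds on a degree sequence. This is a small sketch with exact rational arithmetic; the function names and the sample degree sequences are hypothetical, and the degree list is assumed to be nonempty.

```python
from fractions import Fraction


def degree_sum_bound(degrees):
    """The bound of Theorem 7.1.3: the sum of 1/(1 + deg v)."""
    return sum(Fraction(1, 1 + d) for d in degrees)


def weaker_bound(degrees):
    """The bound of Corollary 7.2.1: n^2 / (n + 2m), using 2m = sum of degrees."""
    n = len(degrees)
    return Fraction(n * n, n + sum(degrees))


star = [3, 1, 1, 1]        # degrees of a star with three leaves
cycle = [2, 2, 2, 2, 2]    # degrees of the 5-cycle

for degs in (star, cycle):
    print(degree_sum_bound(degs), weaker_bound(degs))
```

On the star the two bounds are 7/4 and 8/5, so the corollary is strictly weaker there; on the cycle (and on any regular graph) both bounds agree, matching the equality case of Lemma 7.2.2 when all a_v are equal.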


7.3. A proof of Turán’s theorem


Recall Turán’s theorem (Theorem 2.4.8), whose proof we have not given so far.
Now is the time. For the sake of convenience, let me repeat the statement of 
the theorem: 


Theorem 7.3.1 (Turán’s theorem). Let r be a positive integer. Let G be a
simple graph with n vertices and e edges. Assume that

\[ e > \frac{r - 1}{r} \cdot \frac{n^2}{2} . \]

Then, there exist r + 1 distinct vertices of G that are mutually adjacent (i.e.,
any two distinct vertices among these r + 1 vertices are adjacent).


We can now easily derive it from Corollary 7.2.1:


Proof of Theorem 7.3.1. Write the simple graph G as G = (V, E). Thus, |V| = n
and |E| = e and E ⊆ 𝒫_2(V).

Let Ē := 𝒫_2(V) \ E. Thus, the set Ē consists of all “non-edges” of G, that is,
of all 2-element subsets of V that are not edges of G. Clearly,

\[ |\overline{E}| = |\mathcal{P}_2(V) \setminus E| = \underbrace{|\mathcal{P}_2(V)|}_{= \binom{n}{2}} - \underbrace{|E|}_{= e} = \binom{n}{2} - e . \]

Now, let Ḡ be the simple graph (V, Ē). This simple graph Ḡ is called the
complementary graph of G; it has n vertices and |Ē| = \binom{n}{2} − e edges.¹⁰⁰

Hence, Corollary 7.2.1 (applied to Ḡ and \binom{n}{2} − e instead of G and m) yields
that Ḡ has an independent set of size

\[ \geq \frac{n^2}{n + 2 \left( \binom{n}{2} - e \right)} . \]

Let S be this independent set. Its size is

\[ |S| \geq \frac{n^2}{n + 2 \left( \binom{n}{2} - e \right)} = \frac{n^2}{n + n(n-1) - 2e} = \frac{n^2}{n^2 - 2e} > r \]

(this inequality follows by high-school algebra from e > (r − 1)/r · n²/2). Hence,
|S| ≥ r + 1 (since |S| and r are integers). However, S is an independent set
of Ḡ. Thus, any two distinct vertices in S are non-adjacent in Ḡ and therefore
adjacent in G (by the definition of Ḡ). Since |S| ≥ r + 1, we have thus found
r + 1 (or more) distinct vertices of G that are mutually adjacent in G. This
proves Theorem 7.3.1. □

¹⁰⁰ For example, if G = [figure omitted], then Ḡ = [figure omitted].
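The “high-school algebra” in the above proof can be spelled out as follows. This is just one possible way to arrange the computation, using only the assumption of Theorem 7.3.1 and the fact that the complementary graph has a nonnegative number of edges.

```latex
% Step 1: rewrite the assumption on e.
\[
e > \frac{r-1}{r} \cdot \frac{n^2}{2}
\quad \Longrightarrow \quad
2e > n^2 - \frac{n^2}{r}
\quad \Longrightarrow \quad
n^2 - 2e < \frac{n^2}{r} .
\]
% Step 2: the denominator is positive. Indeed, n \geq 1 (otherwise e = 0 would
% contradict the assumption), and \binom{n}{2} - e \geq 0 counts non-edges.
\[
n^2 - 2e = n + 2 \left( \binom{n}{2} - e \right) \geq n > 0 .
\]
% Step 3: a smaller positive denominator gives a larger fraction.
\[
\frac{n^2}{n^2 - 2e} > \frac{n^2}{n^2 / r} = r .
\]
```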


Several other beautiful proofs of Theorem 7.3.1 can be found in [AigZie18,
Chapter 41] and [Zhao23, §1.2].


8. Matchings 


8.1. Introduction 


Independent sets of a graph consist of vertices that “have no edges in common” 
(i.e., no two belong to the same edge). 

In a sense, matchings are the dual notion to this: they consist of edges that 
“have no vertices in common” (i.e., no two contain the same vertex). Here is 
the formal definition: 


Definition 8.1.1. Let G = (V, E, φ) be a loopless multigraph.


(a) A matching of G means a subset M of E such that no two distinct edges 
in M have a common endpoint. 


(b) If M is a matching of G, then an M-edge shall mean an edge that 
belongs to M. 


(c) If M is a matching of G, and if v ∈ V is any vertex, then we say that
v is matched in M (or saturated in M) if v is an endpoint of an M- 
edge. In this case, this latter M-edge is necessarily unique (since M is 
a matching), and is called the M-edge of v. The other endpoint of this 
M-edge (i.e., its endpoint different from v) is called the M-partner of v. 


(d) A matching M of G is said to be perfect if each vertex of G is matched 
in M. 


(e) Let A be a subset of V. A matching M of G is said to be A-complete if 
each vertex in A is matched in M. 
Thus, a matching M of a multigraph G = (V, E, φ) is perfect if and only if it
is V-complete.
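The notions of Definition 8.1.1 translate directly into code. The following sketch assumes a loopless multigraph given by a hypothetical dictionary `endpoints` that sends each edge name to its pair of (distinct) endpoints, mirroring the map φ; all function names are illustrative.

```python
def is_matching(M, endpoints):
    """Check that no two distinct edges in M have a common endpoint."""
    seen = set()
    for e in M:
        u, v = endpoints[e]
        if u in seen or v in seen:
            return False
        seen.update((u, v))
    return True


def partners(M, endpoints):
    """Map each vertex matched in M to its M-partner (assumes M is a matching)."""
    p = {}
    for e in M:
        u, v = endpoints[e]
        p[u], p[v] = v, u
    return p


def is_complete(M, endpoints, A):
    """Check that the matching M is A-complete, i.e., each vertex in A is matched in M."""
    return set(A) <= set(partners(M, endpoints))


# Hypothetical multigraph on vertices 1, 2, 3, 4 with two parallel edges a and b.
endpoints = {"a": (1, 2), "b": (1, 2), "c": (2, 3), "d": (3, 4)}
print(is_matching({"a", "d"}, endpoints), partners({"a", "d"}, endpoints))
```

Here the matching {a, d} is perfect, which is the same as being {1, 2, 3, 4}-complete, whereas {a, b} is not a matching because parallel edges share both endpoints.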


Exercise 8.1. Let G be the following simple graph:

[figure: the simple graph G]

Then:


• The set {12, 36, 47} is a matching of G. If we call this set M, then the
vertices matched in M are 1, 2, 3, 4, 6, 7, and their respective M-partners
are 2, 1, 6, 7, 3, 4. This matching is not perfect, but it is (for example)
{1, 3, 4}-complete and {1, 2, 3, 4, 6, 7}-complete.


• The set {12, 36, 67} is not a matching of G, since the two distinct edges
36 and 67 from this set have a common endpoint.


• The sets ∅, {36}, {15, 29, 36, 47} are matchings of G as well.


We see that any matching “pairs up” some vertices using the existing edges 


of the graph. Clearly, the M-partner of the M-partner of a vertex v is v itself. 
Also, no two distinct vertices have the same M-partner (since otherwise, their 
M-edges would have a common endpoint). 


Remark 8.1.2. A matching of a loopless multigraph G = (V, E, φ) can also
be characterized as a subset M of its edge set E such that all vertices of the
spanning subgraph (V, M, φ|_M) have degree ≤ 1.


Warning 8.1.3. If a multigraph G has loops, then most authors additionally
require that a matching must not contain any loops. This ensures that
Remark 8.1.2 remains valid.


Here are some natural questions:
• Does a given graph G have a perfect matching?
• If not, can we find a maximum-size matching?
• What about an A-complete matching for a given A ⊆ V?


Here are some examples: 
Example 8.1.4. Let n and m be two positive integers. The Cartesian product
P_n × P_m of the n-th path graph P_n and the m-th path graph P_m is known as
the (n, m)-grid graph, as it looks as follows:

[figure: the (n, m)-grid graph]


(a) If n is even, then

{{(i, j), (i + 1, j)} | i is odd, while j is arbitrary}

is a perfect matching of P_n × P_m. For example, here is this perfect
matching for n = 4 and m = 3 (we have drawn all edges that do not
belong to this matching as dotted lines):

[figure: this perfect matching for n = 4 and m = 3]


(b) Likewise, if m is even, then

{{(i, j), (i, j + 1)} | j is odd, while i is arbitrary}

is a perfect matching of P_n × P_m.

(c) If n and m are both odd, then P_n × P_m has no perfect matching. Indeed,
any loopless multigraph G with an odd number of vertices cannot have
a perfect matching, since each edge of the matching covers exactly 2
vertices.


Example 8.1.5. The “pentagon with two antlers” C₅″ (this is my notation,
hopefully sufficiently natural) is the following graph:

[figure: the graph C₅″]

It has no perfect matching. This is easiest to see as follows: The graph C₅″
is loopless, so each edge contains exactly two vertices. Thus, any matching
M of C₅″ matches exactly 2 · |M| vertices. In particular, any matching of C₅″
matches an even number of vertices. Since the total number of vertices of C₅″ is
odd, this entails that C₅″ has no perfect matching.

What is the maximum size of a matching of C₅″? The matching {12, 34} of
C₅″ has size 2 and cannot be improved by adding any new edges. Thus, one
is tempted to believe that the maximum size of a matching is 2. However,
this is not the case. Indeed, the matching {12, 37, 45} has size 3. This latter
matching is actually maximum-size.


Example 8.1.5 shows that when searching for a maximum-size matching, it is
not sufficient to just keep adding edges until no further edges can be added; this
strategy may lead to a non-improvable but non-maximum matching. This suggests
that finding a maximum-size matching may be one of those hard problems
like finding a maximum-size independent set. But no: there is a polynomial-time
algorithm! It’s known as the Edmonds blossom algorithm, and it has a
running time of O(|E| · |V|²); however, it is too complicated to be covered in
this course. We shall here focus on a simple case of the problem that is already
interesting enough and almost as useful as the general case.

Namely, we shall study matchings of bipartite graphs.
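The gap between “non-improvable” and “maximum-size” already shows up on a path with four vertices. The sketch below uses this hypothetical small example (not the graph of Example 8.1.5, whose picture is not reproduced here), with edges given as pairs of vertices of a simple graph. The brute-force search is exponential and is only meant as a reference for tiny inputs; it is not the blossom algorithm.

```python
from itertools import combinations


def is_matching(edges):
    """Check that no vertex is used by two of the given edges."""
    used = [v for e in edges for v in e]
    return len(used) == len(set(used))


def greedy_matching(edges):
    """Scan the edges in the given order, keeping each edge that still fits."""
    M = []
    for e in edges:
        if is_matching(M + [e]):
            M.append(e)
    return M


def maximum_matching(edges):
    """Brute force over all subsets of edges, largest first (exponential time)."""
    for k in range(len(edges), -1, -1):
        for M in combinations(edges, k):
            if is_matching(M):
                return list(M)


# Hypothetical example: the path 1 - 2 - 3 - 4, with the middle edge listed first.
path_edges = [(2, 3), (1, 2), (3, 4)]
print(greedy_matching(path_edges), maximum_matching(path_edges))
```

The greedy scan stops with the single edge (2, 3), to which no further edge can be added, while the maximum-size matching consists of the two outer edges.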




8.2. Bipartite graphs


Definition 8.2.1. A bipartite graph means a triple (G, X, Y), where

• G = (V, E, φ) is a multigraph, and

• X and Y are two disjoint subsets of V such that X ∪ Y = V and such
that each edge of G has one endpoint in X and one endpoint in Y.


Example 8.2.2. Consider the 6-th cycle graph C_6:

[figure: the cycle graph C_6]

Then, (C_6, {1, 3, 5}, {2, 4, 6}) is a bipartite graph, since each edge of
C_6 has one endpoint in {1, 3, 5} and one endpoint in {2, 4, 6}. Also,
(C_6, {2, 4, 6}, {1, 3, 5}) is a bipartite graph.

Note that a bipartite graph (G, X, Y) is not just the graph G but rather
the whole package consisting of the graph G and the subsets X and Y.
Two different bipartite graphs can have the same underlying graph G
but different choices of X and Y. For example, the two bipartite graphs
(C_6, {1, 3, 5}, {2, 4, 6}) and (C_6, {2, 4, 6}, {1, 3, 5}) are different.

We typically draw a bipartite graph (G, X, Y) by drawing the graph G in
such a way that the vertices in X are aligned along one vertical line and the
vertices in Y are aligned along another, with the former line being left of the
latter. Thus, for example, the bipartite graph (C_6, {1, 3, 5}, {2, 4, 6}) can be
drawn as follows:

[figure: C_6 drawn with 1, 3, 5 on the left and 2, 4, 6 on the right]

Similarly, the bipartite graph (C_6, {2, 4, 6}, {1, 3, 5}) can be drawn as follows:

[figure: C_6 drawn with 2, 4, 6 on the left and 1, 3, 5 on the right]


This example suggests the following terminology: 


Definition 8.2.3. Let (G, X,Y) be a bipartite graph. We shall refer to the 
vertices in X as the left vertices of this bipartite graph. We shall refer to the 
vertices in Y as the right vertices of this bipartite graph. Moreover, the edges 
of G will be called the edges of this bipartite graph. 


Thus, each edge of a bipartite graph joins one left vertex with one right vertex. 


Bipartite graphs are “the same as” multigraphs with a proper 2-coloring. To 
wit: 


Proposition 8.2.4. Let G = (V, E, φ) be a multigraph.


(a) If (G, X, Y) is a bipartite graph, then the map

\[ f : V \to \{1, 2\}, \qquad v \mapsto \begin{cases} 1, & \text{if } v \in X; \\ 2, & \text{if } v \in Y \end{cases} \]

is a proper 2-coloring of G.


(b) Conversely, if f : V → {1, 2} is a proper 2-coloring of G, then (G, V_1, V_2)
is a bipartite graph, where we set

V_i := {all vertices with color i} for each i ∈ {1, 2}.


(c) These constructions are mutually inverse. (That is, going from a bipartite
graph to a proper 2-coloring and back again results in the original
bipartite graph, whereas going from a proper 2-coloring to a bipartite
graph and back again results in the original 2-coloring.)

Proof. An exercise in understanding the definitions. □


Proposition 8.2.5. Let (G, X, Y) be a bipartite graph. Then, the graph G has 
no circuits of odd length. In particular, G has no loops or triangles. 


Proof. By Proposition 8.2.4 (a), we know that G has a proper 2-coloring. Hence,
the 2-coloring equivalence theorem (Theorem 6.2.1) shows that G has no circuits
of odd length. In particular, G has no loops or triangles (since these would yield
circuits of length 1 or 3, respectively). □
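Propositions 8.2.4 and 8.2.5 suggest a practical way to decide whether a given graph G can be extended to a bipartite graph (G, X, Y): try to build a proper 2-coloring by breadth-first search, and give up as soon as two adjacent vertices receive the same color. The sketch below is an illustration under assumptions (the graph is given as a hypothetical dictionary `adj` of neighbor sets, and the function name is made up); it is not an algorithm stated in the text.

```python
from collections import deque


def bipartition(adj):
    """Return (X, Y) coming from a proper 2-coloring, or None if none exists."""
    color = {}
    for start in adj:
        if start in color:
            continue
        color[start] = 1                 # color one vertex per component
        queue = deque([start])
        while queue:
            v = queue.popleft()
            for w in adj[v]:
                if w not in color:
                    color[w] = 3 - color[v]   # the other color
                    queue.append(w)
                elif color[w] == color[v]:
                    return None          # adjacent vertices with equal colors
    X = {v for v in adj if color[v] == 1}
    Y = {v for v in adj if color[v] == 2}
    return X, Y


hexagon = {1: {2, 6}, 2: {1, 3}, 3: {2, 4}, 4: {3, 5}, 5: {4, 6}, 6: {5, 1}}
triangle = {1: {2, 3}, 2: {1, 3}, 3: {1, 2}}
print(bipartition(hexagon), bipartition(triangle))
```

For the 6-cycle this recovers the pair ({1, 3, 5}, {2, 4, 6}) of Example 8.2.2, while the triangle (a circuit of odd length) yields None.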


We need another piece of notation: 


Definition 8.2.6. Let G = (V, E, φ) be any multigraph. Let U be a subset of
V. Then,

N(U) := {v ∈ V | v has a neighbor in U}.

This is called the neighbor set of U.


Example 8.2.7. If G is the “pentagon with antlers” C₅″ from Example 8.1.5,
then

N({1, 5, 6}) = {1, 2, 4, 5};
N({1}) = {2, 5};
N(∅) = ∅.


For bipartite graphs, the neighbor set has a nice property: 
Proposition 8.2.8. Let (G, X, Y) be a bipartite graph. Let A ⊆ X. Then,

N(A) ⊆ Y.


Proof. Let v ∈ N(A). Thus, the vertex v has a neighbor in A (by definition of
N(A)). Let w be this neighbor. Then, w ∈ A ⊆ X, so that w ∉ Y (since the
bipartiteness of (G, X, Y) shows that the sets X and Y are disjoint).

There exists some edge that has endpoints v and w (since w is a neighbor of
v). This edge must have an endpoint in Y (since the bipartiteness of (G, X, Y)
shows that each edge of G has one endpoint in Y). In other words, one of v and
w must belong to Y (since the endpoints of this edge are v and w). Since w ∉ Y,
we thus conclude that v ∈ Y.

Thus, we have shown that v ∈ Y for each v ∈ N(A). In other words, N(A) ⊆ Y. □
Exercise 8.2. Let (G, X, Y) be a bipartite graph. Prove that

\[ \sum_{A \subseteq X} (-1)^{|A|} [N(A) = Y] = \sum_{B \subseteq Y} (-1)^{|B|} [N(B) = X] \]

(where we are using the Iverson bracket notation).


8.3. Hall’s marriage theorem 


How can we tell whether a bipartite graph has a perfect matching? an X- 
complete matching? First, to keep the suspense, let us prove some trivialities: 


Proposition 8.3.1. Let (G, X, Y) be a bipartite graph. Let M be a matching of
G. Then:


(a) The M-partner of a vertex x ∈ X (if it exists) belongs to Y.
The M-partner of a vertex y ∈ Y (if it exists) belongs to X.

(b) We have |M| ≤ |X| and |M| ≤ |Y|.

(c) If M is X-complete, then |X| ≤ |Y|.

(d) If M is perfect, then |X| = |Y|.

(e) If |M| ≥ |X|, then M is X-complete.

(f) If M is X-complete and we have |X| = |Y|, then M is perfect.


Proof. Each edge of G has an endpoint in X and an endpoint in Y (since
(G, X, Y) is a bipartite graph). Thus, in particular, each M-edge has an
endpoint in X and an endpoint in Y. Moreover, no two M-edges share a common
endpoint (since M is a matching).

(a) This follows from the fact that each M-edge has an endpoint in X and an
endpoint in Y.

(b) Recall that each M-edge has an endpoint in X. Since no two M-edges
share a common endpoint, we thus have found at least |M| many endpoints in
X. This entails |M| ≤ |X|. Similarly, |M| ≤ |Y|.

(c) Assume that M is X-complete. Hence, each vertex in X is matched in M
and therefore has an M-edge that contains it. In other words, for each vertex
x ∈ X, there exists an M-edge m such that x is an endpoint of m. Since no
two M-edges share an endpoint, this yields that there are at least |X| many
M-edges. In other words, |M| ≥ |X|. Hence, |X| ≤ |M| ≤ |Y| (by part (b)).

(d) Assume that M is perfect. Then, M is both X-complete and Y-complete.
Hence, part (c) yields |X| ≤ |Y|; similarly, |Y| ≤ |X|. Combining these two
inequalities, we obtain |X| = |Y|.

(e) Assume that |M| ≥ |X|.

However, each M-edge has an endpoint in X. These endpoints are all distinct
(since no two M-edges share a common endpoint), and there are at least |X|
many of them (since there are |M| many of them, but we have |M| ≥ |X|).
Therefore, these endpoints must cover all the vertices in X (because the only
way to choose |X| many distinct vertices in X is to choose all vertices in X). In
other words, all the vertices in X must be matched in M. In other words, the
matching M is X-complete.

(f) Assume that M is X-complete and that we have |X| = |Y|.

The matching M is X-complete; thus, all vertices x ∈ X are matched in M.
The M-partners of all these vertices x ∈ X belong to Y (by Proposition 8.3.1
(a)), and are also matched in M. Hence, at least |X| many vertices in Y must
be matched in M (since these M-partners are all distinct¹⁰¹). In other words,
at least |Y| many vertices in Y must be matched in M (since |X| = |Y|). This
means that all vertices in Y are matched in M (since “at least |Y| many vertices
in Y” means “all vertices in Y”). Since we also know that all vertices x ∈ X are
matched in M, we thus conclude that all vertices of G are matched in M. In
other words, the matching M is perfect. □


Example 8.3.2. Consider the bipartite graph

[figure: a bipartite graph whose left vertices include 1 and 3]

(drawn as explained in Example 8.2.2). Does this graph have a perfect matching?
No, because the two left vertices 1 and 3 would necessarily have the
same partner in such a matching (since their only possible partner is 2).


¹⁰¹ because the M-partners of distinct vertices are distinct

Similarly, the bipartite graph

[figure: a bipartite graph whose left vertices include 1, 5 and 7]

has no perfect matching, since the three left vertices 1, 5 and 7 have only two
potential partners (viz., 2 and 6).


So we see that a subset A ⊆ X satisfying |N(A)| < |A| is an obstruction to
the existence of an X-complete matching. Let us state this in a positive way:


Proposition 8.3.3. Let (G, X, Y) be a bipartite graph. Let A be a subset of X.
Assume that G has an X-complete matching. Then, |N(A)| ≥ |A|.


Proof. Let V be the vertex set of G. We assumed that G has an X-complete
matching. Let M be such a matching. Thus, each x ∈ X has an M-partner. The
map

p : X → V,
x ↦ (the M-partner of x)

is injective (since two distinct vertices cannot have the same M-partner). Thus,
|p(A)| = |A| (because any injective map preserves the size of a subset).
However, p(A) ⊆ N(A), because the M-partner of an element of A will always
belong to N(A). Hence, |p(A)| ≤ |N(A)|. Thus, |N(A)| ≥ |p(A)| = |A|,
qed. □


So we have found a necessary condition for the existence of an X-complete 
matching. Interestingly, it is also sufficient: 


Theorem 8.3.4 (Hall’s marriage theorem, short: HMT). Let (G, X, Y) be a
bipartite graph. Assume that each subset A of X satisfies |N(A)| ≥ |A|.
(This assumption is called the “Hall condition”.)

Then, G has an X-complete matching.
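For a small bipartite graph, the Hall condition can be tested directly by running through all subsets A of X. The following sketch does exactly that; it assumes the graph is given as a hypothetical dictionary `adj` of neighbor sets, and the function names are illustrative. The search is exponential in |X|, so it is a way to experiment with the statement and not an efficient test.

```python
from itertools import combinations


def neighbor_set(adj, U):
    """N(U): all vertices that have at least one neighbor in U."""
    U = set(U)
    return {v for v in adj if adj[v] & U}


def hall_violator(adj, X):
    """Return a subset A of X with |N(A)| < |A|, or None if the Hall condition holds."""
    X = list(X)
    for k in range(1, len(X) + 1):
        for A in combinations(X, k):
            if len(neighbor_set(adj, A)) < k:
                return set(A)
    return None


# Hypothetical example in the spirit of Example 8.3.2: left vertices 1 and 3
# have 2 as their only possible partner.
bad = {1: {2}, 3: {2}, 2: {1, 3}, 4: set()}
hexagon = {1: {2, 6}, 2: {1, 3}, 3: {2, 4}, 4: {3, 5}, 5: {4, 6}, 6: {5, 1}}
print(hall_violator(bad, [1, 3]), hall_violator(hexagon, [1, 3, 5]))
```

In the first graph the subset {1, 3} violates the Hall condition, so by Proposition 8.3.3 there is no X-complete matching; in the 6-cycle no violating subset exists, in accordance with the X-complete matching {12, 34, 56}.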
This is called the “marriage theorem” because one can interpret a bipartite graph
as a dating scene, with X being the guys and Y the ladies. A guy x and a
lady y are adjacent if and only if they are interested in one another. Thus, an
X-complete matching is a way of marrying off each guy to some lady he is
mutually interested in (without allowing polygamy). This is a classical model
for bipartite graphs and appears all across the combinatorics literature; to my
knowledge, however, no real-life applications have been found along these
lines. Nevertheless, Hall’s marriage theorem can be applied in many other
situations, such as logistics (although its generalizations, which we will soon see,
are even more useful there). Philip Hall originally found the theorem
in 1935 (in a somewhat obfuscated form), motivated (I believe) by a problem
about finite groups. So did Wilhelm Maak, also in 1935, for use in analysis
(defining a notion of integrals for almost-periodic functions).

There are many proofs of Hall’s marriage theorem, some pretty easy. Two
short and self-contained proofs can be found in [LeLeMel18, §12.5.2] and in
[Harju14, Theorem 3.9]. I will tease you by leaving the theorem unproved for
several pages, while exploring some of its many consequences. Afterwards, I
will give two proofs of Hall’s marriage theorem:


• one proof using the theory of network flows (Section 9.5), an elegant
theory created for use in logistics¹⁰² in the 1950s that has proved to be
quite useful in combinatorics. Among other consequences, this proof will
also provide a polynomial-time algorithm for actually finding a maximum
matching in a bipartite graph (Theorem 8.3.4 by itself does not help here).


• another proof using the Gallai-Milgram theorem (Subsection 10.2.3), an
elegant and surprising property of paths in digraphs.


8.4. König and Hall–König


Hall’s marriage theorem is famous for its many forms and versions, most of
which are “secretly” equivalent to it (i.e., can be derived from it and conversely
can be used to derive it without too much trouble). We will start with one that
is known as König’s theorem (discovered independently by Dénes Kőnig and
Jenő Egerváry in 1931). This relies on the notion of a vertex cover. Here is its
definition:


Definition 8.4.1. Let G = (V, E, φ) be a multigraph. A vertex cover of G
means a subset C of V such that each edge of G contains at least one vertex 
in C. 


¹⁰² and, more generally, operations research
Example 8.4.2. Let n > 1. What are the vertex covers of the complete graph
K_n?

A quick thought reveals that any subset S of {1, 2, ..., n} that has at least
n − 1 elements is a vertex cover of K_n. (In fact, K_n has no loops, so that
each edge of K_n contains two different vertices, and thus at least one of these
two vertices belongs to S.) On the other hand, a subset S with fewer than
n − 1 vertices will never be a vertex cover of K_n (since there will be at least
two distinct vertices that don’t belong to S, and the edge that joins these two
vertices contains no vertex in S).


Example 8.4.3. Let G = (V,E,q@) be the graph from Example Then, 
the set {2,5} is a vertex cover of G. Of course, any subset of V that contains 
{2,5} as a subset will thus also be a vertex cover of G. 


Note that the notion of a vertex cover is (in some sense) “dual” to the notion 
of an edge cover, which we defined in Exercise For those getting confused, 


here is a convenient table (courtesy of Nadia Lafrenière, Math 38, Spring 2021):


independent set    vertices    at most one vertex per edge
vertex cover       vertices    at least one vertex per edge


The notion of vertex covers is also somewhat reminiscent of the notion of 
dominating sets; here is the precise relation: 


Remark 8.4.4. Each vertex cover of a multigraph G is a dominating set (as 
long as G has no vertices of degree 0). But the converse is not true. 


Proposition 8.4.5. Let G be a loopless multigraph.
Let m be the largest size of a matching of G.
Let c be the smallest size of a vertex cover of G.
Then, m ≤ c.


Proof. By the definition of m, we know that G has a matching M of size m.
By the definition of c, we know that G has a vertex cover C of size c.
Consider these M and C. Every M-edge e ∈ M contains at least one vertex
in C (since C is a vertex cover). Thus, we can define a map f : M → C that
sends each M-edge e to some vertex in C that is contained in e. (If there are two
such vertices, then we just pick one of them at random.) This map f is injective,
because no two M-edges contain the same vertex (after all, M is a matching).
Thus, we have found an injective map from M to C (namely, f). Therefore,
|M| ≤ |C|. But the definitions of M and C show that |M| = m and |C| = c.
Thus, m = |M| ≤ |C| = c, and Proposition 8.4.5 is proved. □


In general, we can have m < c in Proposition 8.4.5. However, for a bipartite
graph, equality reigns:


Theorem 8.4.6 (König’s theorem). Let (G, X, Y) be a bipartite graph.
Let m be the largest size of a matching of G.
Let c be the smallest size of a vertex cover of G.
Then, m = c.
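Both numbers m and c can be computed by exhaustive search on small graphs, which makes it easy to see the strict inequality of Proposition 8.4.5 in a non-bipartite case and the equality of Theorem 8.4.6 in a bipartite one. The sketch below assumes simple graphs given by hypothetical lists of vertices and of edges (pairs of distinct vertices); the function names are illustrative, and the running time is exponential.

```python
from itertools import combinations


def max_matching_size(edges):
    """Largest size m of a matching, by brute force over all subsets of edges."""
    for k in range(len(edges), -1, -1):
        for M in combinations(edges, k):
            used = [v for e in M for v in e]
            if len(used) == len(set(used)):   # no vertex is used twice
                return k


def min_vertex_cover_size(vertices, edges):
    """Smallest size c of a vertex cover, by brute force over all subsets of vertices."""
    for k in range(len(vertices) + 1):
        for C in combinations(vertices, k):
            if all(u in C or v in C for u, v in edges):
                return k


triangle = [(1, 2), (2, 3), (1, 3)]                          # not bipartite
hexagon = [(1, 2), (2, 3), (3, 4), (4, 5), (5, 6), (6, 1)]   # the cycle C_6

print(max_matching_size(triangle), min_vertex_cover_size([1, 2, 3], triangle))
print(max_matching_size(hexagon), min_vertex_cover_size([1, 2, 3, 4, 5, 6], hexagon))
```

The triangle has m = 1 and c = 2, whereas the 6-cycle (which underlies the bipartite graph of Example 8.2.2) has m = c = 3.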


Both Hall’s and König’s theorems easily follow from the following theorem:


Theorem 8.4.7 (Hall–König matching theorem). Let (G, X, Y) be a bipartite
graph. Then, there exist a matching M of G and a subset U of X such that

\[ |M| \geq |N(U)| + |X| - |U| . \]


We will prove this theorem in Section and again in Subsection 10.2.3.
For now, let us show that Hall’s marriage theorem (Theorem 8.3.4), König’s
theorem (Theorem 8.4.6) and the Hall–König matching theorem (Theorem 8.4.7)
are mutually equivalent. More precisely, we will explain how to derive the first
two from the third, and outline the reverse derivations.


Proof of Theorem 8.3.4 using Theorem 8.4.7. Assume that Theorem 8.4.7 has already
been proved.

Theorem 8.4.7 yields that there exist a matching M of G and a subset U of X
such that

\[ |M| \geq |N(U)| + |X| - |U| . \]

Consider these M and U. The Hall condition shows that each subset A of X
satisfies |N(A)| ≥ |A|. Applying this to A = U, we obtain |N(U)| ≥ |U|.
Thus,

\[ |M| \geq \underbrace{|N(U)|}_{\geq |U|} + |X| - |U| \geq |X| . \]

Hence, the matching M is X-complete (by Proposition 8.3.1 (e)). Thus, we have
found an X-complete matching. This proves Theorem 8.3.4 (assuming that
Theorem 8.4.7 is true). □


Proof of Theorem 8.4.6 using Theorem 8.4.7. Assume that Theorem 8.4.7 has already
been proved.

Write the multigraph G as G = (V, E, φ). Theorem 8.4.7 yields that there
exist a matching M of G and a subset U of X such that

\[ |M| \geq |N(U)| + |X| - |U| . \tag{44} \]

Consider these M and U. Clearly, |M| ≤ m (since m is the largest size of a
matching of G).

Let C := (X \ U) ∪ N(U). This is a subset of V. Moreover, each edge of G
has at least one endpoint in C (this is easy to see¹⁰³). Hence, C is a vertex cover
of G. Therefore, |C| ≥ c (since c is the smallest size of a vertex cover of G). The
definition of C yields

\[
\begin{aligned}
|C| &= |(X \setminus U) \cup N(U)|
 \leq \underbrace{|X \setminus U|}_{\substack{= |X| - |U| \\ \text{(since } U \subseteq X)}} + |N(U)|
 \qquad \text{(actually an equality, but we don't care)} \\
&= |X| - |U| + |N(U)| = |N(U)| + |X| - |U| \leq |M| \qquad \text{(by (44))} \\
&\leq m .
\end{aligned}
\]

Hence, m ≥ |C| ≥ c. Combining this with m ≤ c (which follows from
Proposition 8.4.5), we obtain m = c. Thus, Theorem 8.4.6 follows. □


Conversely, it is not hard to derive the HKMT from either Hall or König: 


Proof of Theorem 8.4.7 using Theorem 8.3.4 (sketched). Assume that Theorem 8.3.4 has
already been proved.

Add a bunch of “dummy vertices” to Y and join each of these “dummy vertices” by
a new edge to each vertex in X. How many “dummy vertices” should we add? As
many as it takes to ensure that every subset A of X satisfies the Hall condition, i.e.,
exactly max {|A| − |N(A)| | A is a subset of X} many.

Let G′ be the resulting graph. Let also D be the set of all dummy vertices that
were added to Y, and let Y′ = Y ∪ D be the set of all right vertices of G′. (The set
of left vertices of G′ is still X.) Then, the bipartite graph (G′, X, Y′) satisfies the Hall
condition, and therefore we can apply Theorem 8.3.4 to (G′, X, Y′) instead of (G, X, Y),
and conclude that the graph G′ has an X-complete matching. Let M′ be this matching.
By removing from M′ all edges that contain dummy vertices, we obtain a matching M


¹⁰³ Proof. Let e be an edge of G. We must show that e has at least one endpoint in C.
Clearly, the edge e has an endpoint in X (since (G, X, Y) is a bipartite graph). Let x be
this endpoint. This x either belongs to U or doesn’t.

• If x belongs to U, then the other endpoint of e (that is, the endpoint distinct from
x) belongs to N(U) (since its neighbor x belongs to U) and therefore to C (since
N(U) ⊆ (X \ U) ∪ N(U) = C).

• If x does not belong to U, then x belongs to X \ U (since x ∈ X) and therefore to C
(since X \ U ⊆ (X \ U) ∪ N(U) = C).

In either of these two cases, we have found an endpoint of e that belongs to C. Thus, e
has at least one endpoint in C, qed.


An introduction to graph theory, version August 2, 2023 page 321 


of G. This matching M has size 


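\[
\begin{aligned}
|M| &= |M'| - \underbrace{(\text{the number of edges that were removed from } M')}_{\substack{\le\, (\text{the number of dummy vertices}) \\ \text{(since each dummy vertex is contained in at most one } M'\text{-edge)}}} \\
    &\ge |M'| - \underbrace{(\text{the number of dummy vertices})}_{\substack{=\, \max\{|A| - |N(A)| \;\mid\; A \text{ is a subset of } X\} \\ \text{(by the construction of the dummy vertices)}}} \\
    &= |M'| - \max\{|A| - |N(A)| \;\mid\; A \text{ is a subset of } X\} .
\end{aligned}
\tag{45}
\]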


However, the maximum of a set is always an element of this set. Hence, there exists 
a subset U of X such that 


max {|A| -|N (A)| | A is a subset of X} = |U|—|N(U)|. 


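Consider this U. Then, (45) becomes

\[
\begin{aligned}
|M| &\ge \underbrace{|M'|}_{\substack{\ge\, |X| \\ \text{(since } M' \text{ is } X\text{-complete,} \\ \text{and thus each } x \in X \text{ has} \\ \text{an } M'\text{-edge (and these} \\ \text{edges are distinct))}}}
       - \underbrace{\max\{|A| - |N(A)| \;\mid\; A \text{ is a subset of } X\}}_{=\, |U| - |N(U)|} \\
    &\ge |X| - (|U| - |N(U)|) = |N(U)| + |X| - |U| .
\end{aligned}
\]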


Hence, we have found a matching M of G and a subset U of X such that |M| ≥
|N(U)| + |X| − |U|. This proves Theorem 8.4.7 (assuming that Theorem 8.3.4 is true). □


Proof of Theorem 8.4.7 from Theorem 8.4.6 (sketched). Assume that Theorem 8.4.6 has
already been proved.

Let M be a maximum-size matching of G. Let C be a minimum-size vertex cover of
G. Then, Theorem 8.4.6 says that |M| = |C|.

Let U := X \ C. Then, N(U) ⊆ C \ X (why?). Hence, |N(U)| ≤ |C \ X|, so that

\[
\underbrace{|N(U)|}_{\le\, |C \setminus X|} + |X| - \underbrace{|U|}_{=\, |X \setminus C|}
\le |C \setminus X| + \underbrace{|X| - |X \setminus C|}_{=\, |C \cap X|}
= |C \setminus X| + |C \cap X| = |C| = |M| .
\]

Hence, |M| ≥ |N(U)| + |X| − |U|. This proves Theorem 8.4.7 (assuming that Theorem
8.4.6 is true). □


Theorem 8.4.7 thus occupies a convenient “high ground” between the Hall
and König theorems, allowing easy access to both of them. We shall prove
Theorem 8.4.7 in Section 9.5 and again in Subsection 10.2.3.


8.5. Systems of representatives


There are two more equivalent forms of the HMT that have the “advantage” that
they do not rely on the notion of a graph. When non-combinatorialists use the
HMT, they often use it in one of these forms. Here is the first form:


Theorem 8.5.1 (existence of SDR). Let A_1, A_2, ..., A_n be any n sets. Assume
that the union of any p of these sets has size ≥ p, for all p ∈ {0, 1, ..., n}. (In
other words, assume that

\[
|A_{i_1} \cup A_{i_2} \cup \cdots \cup A_{i_p}| \ge p \qquad \text{for any } 1 \le i_1 < i_2 < \cdots < i_p \le n .
\]

) Then, we can find n distinct elements

\[
a_1 \in A_1, \quad a_2 \in A_2, \quad \ldots, \quad a_n \in A_n .
\]


Remark 8.5.2. An n-tuple (a_1, a_2, ..., a_n) of n distinct elements like this is
called a system of distinct representatives for our n sets A_1, A_2, ..., A_n. (This
is often abbreviated “SDR”.)


Example 8.5.3. Take a standard deck of cards, and deal them out into 13 piles
of 4 cards each, e.g., as follows:


(2h, 20, 9, KO}, {Ad, AV, 3@, 30}, [AS, 4%, 5%, Ode}, 
{20, 49, 50, 5a}, Ade, 7%, 7h, 70}, [4A, 60, 60, 6h}, 
{30, 3%, 8@, 80}, {2%, Kf, KY, 109}, {40, 5%, 9@, 99}, 
{Qh, QV, QO, Qh}, 160, JA, JO, J}, {7, 80, 8%, 9%}, 
{10@, JỌ, 10%, 10%} 


(you can distribute the cards among the piles randomly; this is just one example).
Then, I claim that it is possible to select exactly 1 card from each pile
so that the 13 selected cards contain exactly 1 card of each rank (i.e., exactly
one ace, exactly one 2, exactly one 3, and so on).

Indeed, this follows from Theorem 8.5.1 (applied to A_i =
{ranks that occur in the i-th pile}) because any p piles contain cards of
at least p different ranks (after all, p piles contain 4p cards, whereas each
rank accounts for only 4 cards of the deck).


Proof of Theorem 8.5.1. First, we WLOG assume that all n sets A_1, A_2, ..., A_n are
finite. (If not, then we can just replace each infinite one by an n-element subset
thereof. The assumption |A_{i_1} ∪ A_{i_2} ∪ ··· ∪ A_{i_p}| ≥ p will not be disturbed by
this change; make sure you understand why!)

Furthermore, we WLOG assume that no integer belongs to any of the n sets
A_1, A_2, ..., A_n (otherwise, we just rename the elements of these sets so that they
aren’t integers any more).

Now, let X = {1, 2, ..., n} and Y = A_1 ∪ A_2 ∪ ··· ∪ A_n. Both sets X and Y are
finite, and are disjoint.

We define a simple graph G as follows: 


• The vertices of G are the elements of X ∪ Y.


• A vertex x ∈ X is adjacent to a vertex y ∈ Y if and only if y ∈ A_x. There
are no further adjacencies.


Thus, (G, X, Y) is a bipartite graph. The assumption |A_{i_1} ∪ A_{i_2} ∪ ··· ∪ A_{i_p}| ≥ p
ensures that it satisfies the Hall condition (indeed, the neighbor set of a subset
{i_1, i_2, ..., i_p} of X is precisely A_{i_1} ∪ A_{i_2} ∪ ··· ∪ A_{i_p}). Hence, by the HMT
(Theorem 8.3.4), we conclude that this graph G has an X-complete matching. This
matching must have the form

{{1, a_1}, {2, a_2}, ..., {n, a_n}}

for some a_1, a_2, ..., a_n ∈ Y (since (G, X, Y) is bipartite, so that the partners of
the vertices 1, 2, ..., n ∈ X must belong to Y). These elements a_1, a_2, ..., a_n ∈ Y
are distinct (since two edges in a matching cannot have a common endpoint),
and each i ∈ {1, 2, ..., n} satisfies a_i ∈ A_i (since the vertex a_i is adjacent to i in
G). Thus, these a_1, a_2, ..., a_n are precisely the n distinct elements we are looking
for. This proves Theorem 8.5.1. □
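
Since an SDR is the same thing as an X-complete matching in the graph G built in this proof, it can be searched for with any bipartite matching routine. Here is a minimal sketch, assuming the sets are given as a Python list of finite sets; the name `find_sdr` is made up for this illustration, and the search is the standard augmenting-path method for bipartite matchings rather than anything specific to the proof above.

```python
def find_sdr(sets):
    """Return a list reps with reps[i] in sets[i] and all reps distinct,
    or None if no system of distinct representatives exists."""
    owner = {}  # element -> index of the set it currently represents

    def try_assign(i, seen):
        # Try to give set i a representative, re-routing earlier sets if needed.
        for a in sets[i]:
            if a in seen:
                continue
            seen.add(a)
            if a not in owner or try_assign(owner[a], seen):
                owner[a] = i
                return True
        return False

    for i in range(len(sets)):
        if not try_assign(i, set()):
            # No augmenting path: no SDR exists, so (by Theorem 8.5.1)
            # some p of the sets must have a union of size < p.
            return None

    reps = [None] * len(sets)
    for a, i in owner.items():
        reps[i] = a
    return reps


if __name__ == "__main__":
    print(find_sdr([{1, 2}, {2, 3}, {1, 3}]))  # some valid SDR
    print(find_sdr([{1, 2}, {1}, {2}]))        # None: 3 sets, union of size 2
```

When the function returns None, Theorem 8.5.1 (in contrapositive form) guarantees that the size condition fails for some selection of the sets, although this sketch does not report which one.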


Conversely, it is not hard to derive the HMT from Theorem 8.5.1. Thus,
Theorem 8.5.1 is an equivalent version of the HMT. It is Theorem 8.5.1 that Hall
originally discovered ([Hall35, Theorem 1]).


Here is the second set-theoretical restatement of the HMT: 


Theorem 8.5.4 (existence of SCR). Let A_1, A_2, ..., A_n be n sets. Let
B_1, B_2, ..., B_m be m sets. Assume that for any numbers 1 ≤ i_1 < i_2 < ··· <
i_p ≤ n, there exist at least p elements j ∈ {1, 2, ..., m} such that the union
A_{i_1} ∪ A_{i_2} ∪ ··· ∪ A_{i_p} has nonempty intersection with B_j. Then, there exists an
injective map σ : {1, 2, ..., n} → {1, 2, ..., m} such that all i ∈ {1, 2, ..., n}
satisfy A_i ∩ B_{σ(i)} ≠ ∅.


Proof. We leave this to the reader. Again, construct an appropriate bipartite
graph and apply the HMT. □


(The “SCR” in the name of the theorem is short for “system of common
representatives”.)
See [MirPer66] for much more about systems of representatives.


8.6. Regular bipartite graphs


The HMT gives a necessary and sufficient criterion for the existence of an X- 
complete matching in an arbitrary bipartite graph. In the more restrictive set- 
ting of regular bipartite graphs — i.e., bipartite graphs where each vertex has the 
same degree —, there is a simpler sufficient condition: such a matching always 
exists! We shall soon prove this surprising fact (which is not hard using the 
HMT), but first let us get the definition in order: 


Definition 8.6.1. Let k ∈ ℕ. A multigraph G is said to be k-regular if all its
vertices have degree k.


Example 8.6.2. A 1-regular graph is a graph whose entire edge set is a perfect
matching. In other words, a 1-regular graph is a graph that is a disjoint union
of copies of the 2-nd path graph P_2. Here is an example of such a graph:


Example 8.6.3. A 2-regular graph is a graph that is a disjoint union of cycle 
graphs. Here is an example of such a graph: 


(yes, a C4 is fine, and so would be a C2). 


Example 8.6.4. The 3-regular graphs are known as cubic graphs or trivalent
graphs. An example is the Petersen graph (defined in Subsection 2.6.3). Here
is another example (known as the Frucht graph):


More examples of cubic graphs can be found on the Wikipedia page. There
is no hope of describing them all.


Recall the Kneser graphs defined in Subsection They are all regular: 
Example 8.6.5. Any Kneser graph K_{S,k} is $\binom{|S|-k}{k}$-regular.


Proof. This is saying that if A is a k-element subset of a finite set S, then there
are precisely $\binom{|S|-k}{k}$ many k-element subsets of S that are disjoint from A.
But this is clear, since the latter subsets are just the k-element subsets of the
(|S| − k)-element set S \ A. □
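
For small parameters, this count can be confirmed by brute force. The following is a minimal sketch, assuming S = {1, 2, ..., n} and k ≥ 1; the function name `kneser_degrees` is made up for this illustration.

```python
from itertools import combinations
from math import comb


def kneser_degrees(n, k):
    """Set of all vertex degrees of the Kneser graph whose vertices are the
    k-element subsets of {1, ..., n} (two subsets are adjacent iff disjoint).
    Assumes k >= 1, so that no subset is disjoint from itself."""
    vertices = [frozenset(c) for c in combinations(range(1, n + 1), k)]
    return {sum(1 for b in vertices if not (a & b)) for a in vertices}


if __name__ == "__main__":
    # The Petersen graph is the case n = 5, k = 2; it should be 3-regular.
    print(kneser_degrees(5, 2), comb(5 - 2, 2))
```

A single-element result set means that the graph is regular, and the element is the common degree.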


Proposition 8.6.6. Let k > 0. Let (G, X, Y) be a k-regular bipartite graph (i.e.,
a bipartite graph such that G is k-regular). Then, |X| = |Y|.


Proof. Write the multigraph G as G = (V, E, φ). Each edge e ∈ E contains
exactly one vertex x ∈ X (since (G, X, Y) is a bipartite graph). Hence,

\[
|E| = \sum_{x \in X} \underbrace{(\text{\# of edges that contain the vertex } x)}_{=\, \deg x}
    = \sum_{x \in X} \underbrace{\deg x}_{\substack{=\, k \\ \text{(since } G \text{ is } k\text{-regular)}}}
    = \sum_{x \in X} k = k \cdot |X| .
\]

Similarly, |E| = k · |Y|. Comparing these two equalities, we obtain k · |X| =
k · |Y|. Since k > 0, we can divide this by k, and conclude |X| = |Y|. □


Theorem 8.6.7 (Frobenius matching theorem). Let k > 0. Let (G,X,Y) be 
a k-regular bipartite graph (i.e., a bipartite graph such that G is k-regular). 
Then, G has a perfect matching. 


Proof. First, we claim that each subset A of X satisfies |N(A)| ≥ |A|.

Indeed, let A be a subset of X. Consider the edges of G that have at least one
endpoint in A. We shall call such edges “A-edges”. How many A-edges are
there?

On the one hand, each A-edge contains exactly one vertex in A (why? see footnote 104).
Thus,

\[
(\text{\# of } A\text{-edges})
= \sum_{x \in A} \underbrace{(\text{\# of } A\text{-edges containing the vertex } x)}_{\substack{=\, \deg x \\ \text{(since each edge that contains the vertex } x \\ \text{is an } A\text{-edge)}}}
= \sum_{x \in A} \underbrace{\deg x}_{\substack{=\, k \\ \text{(since } G \text{ is } k\text{-regular)}}}
= \sum_{x \in A} k = k \cdot |A| .
\]

On the other hand, each A-edge contains exactly one vertex in N(A) (why? see footnote 105).
Thus,

\[
(\text{\# of } A\text{-edges})
= \sum_{y \in N(A)} \underbrace{(\text{\# of } A\text{-edges containing the vertex } y)}_{\le\, \deg y}
\le \sum_{y \in N(A)} \underbrace{\deg y}_{\substack{=\, k \\ \text{(since } G \text{ is } k\text{-regular)}}}
= \sum_{y \in N(A)} k = k \cdot |N(A)| .
\]

Hence,

\[
k \cdot |N(A)| \ge (\text{\# of } A\text{-edges}) = k \cdot |A| .
\]

Since k > 0, we can divide this inequality by k, and thus find |N(A)| ≥ |A|.


104 Here we are using the fact that A ⊆ X, so that no two vertices in A can be adjacent.
105 Here we are using the fact that N(A) ⊆ Y (which follows from A ⊆ X using Proposition
8.2.8), so that no two vertices in N(A) can be adjacent.


Forget that we fixed A. We thus have proved |N(A)| ≥ |A| for each subset
A of X. Hence, the HMT (Theorem 8.3.4) yields that the graph G has an
X-complete matching M. Consider this M.

However, Proposition 8.6.6 yields |X| = |Y|. Hence, Proposition (f)
shows that the matching M is perfect (since M is X-complete). Therefore, G
has a perfect matching. This proves Theorem 8.6.7. □


8.7. Latin squares 


One of many applications of Theorem 8.6.7 is to the study of Latin squares.
Here is the definition of this concept:


Definition 8.7.1. Let n ∈ ℕ. A Latin square of order n is an n × n-matrix M
that satisfies the following conditions:


1. The entries of M are the numbers 1, 2, ..., n, each appearing exactly n
times.


2. In each row of M, the entries are distinct.


3. In each column of M, the entries are distinct.


Example 8.7.2. Here is a Latin square of order 5: 


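\[
\begin{pmatrix}
1 & 2 & 3 & 4 & 5 \\
2 & 3 & 4 & 5 & 1 \\
3 & 4 & 5 & 1 & 2 \\
4 & 5 & 1 & 2 & 3 \\
5 & 1 & 2 & 3 & 4
\end{pmatrix}
\]

Similarly, for each n ∈ ℕ, the matrix $(c_{i+j-1})_{1 \le i \le n,\ 1 \le j \le n}$, where
c_k denotes k if k ≤ n and k − n otherwise, is a Latin square of order n.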


A popular example of Latin squares of order 9 are Sudokus (but they have to
satisfy an additional requirement, concerning certain 3 × 3 subsquares). See the
Wikipedia page and the book [LayMul98] for much more about Latin squares.


The Latin squares in Example 8.7.2 are rather boring. What would be a good
algorithm to construct general Latin squares?

Here is an attempt at a recursive algorithm: We just start by filling in the first
row, then the second row, then the third row, and so on, making sure at each
step that the distinctness conditions (Conditions 2 and 3 in Definition 8.7.1) are
satisfied.


Example 8.7.3. Let us construct a Latin square of order 5 by this algorithm.
We begin (e.g.) with the first row

\[
\begin{pmatrix} 3 & 1 & 4 & 2 & 5 \end{pmatrix} .
\]

Then, we append a second row ( 2 4 1 5 3 ) to it, chosen in such a way
that its five entries are distinct and also each entry is distinct from the entry
above (again, there are many possibilities; we have just picked one). Thus,
we have our first two rows:

\[
\begin{pmatrix}
3 & 1 & 4 & 2 & 5 \\
2 & 4 & 1 & 5 & 3
\end{pmatrix} .
\]

We continue along the same lines, ending up with the Latin square

\[
\begin{pmatrix}
3 & 1 & 4 & 2 & 5 \\
2 & 4 & 1 & 5 & 3 \\
1 & 5 & 2 & 3 & 4 \\
5 & 2 & 3 & 4 & 1 \\
4 & 3 & 5 & 1 & 2
\end{pmatrix}
\]

(or another, depending on the choices we have made).


Does this algorithm always work? 

To be fully honest, it’s not a fully specified algorithm, since I haven’t ex- 
plained how to fill a row (it’s not straightforward). But let’s assume that we 
know how to do this, if it is at all possible. The natural question is: Will we 
always be able to produce a complete Latin square using this algorithm, or will 
we get stuck somewhere (having constructed k rows for some k < n, but being 
unable to produce a (k + 1)-st row)? 

It turns out that we won’t get stuck this way. In other words, the following 
holds: 


Proposition 8.7.4. Let n ∈ ℕ and k ∈ {0, 1, ..., n − 1}. Then, any k × n
Latin rectangle (i.e., any k × n-matrix that contains the entries 1, 2, ..., n,
each appearing exactly k times, and satisfies the Conditions 2 and 3 from
Definition 8.7.1) can be extended to a (k + 1) × n Latin rectangle by adding
an appropriately chosen extra row at the bottom.


Proof. Let M be a k × n Latin rectangle (see footnote 106 for an example). We want
to find a new row that we can append to M at the bottom, such that the result will
be a (k + 1) × n Latin rectangle.


106 For example, if n = 5 and k = 3, then M can be
$\begin{pmatrix} 3 & 1 & 4 & 2 & 5 \\ 2 & 4 & 1 & 5 & 3 \\ 1 & 5 & 2 & 3 & 4 \end{pmatrix}$.


This new row should contain the numbers 1, 2, ..., n in some order. Moreover,
for each i ∈ {1, 2, ..., n}, its i-th entry should be distinct from all entries
of the i-th column of M. How do we find such a new row?

Let X = {1, 2, ..., n} and Y = {−1, −2, ..., −n}.

Let G be the simple graph with vertex set X ∪ Y, where we let a vertex i ∈ X
be adjacent to a vertex −j ∈ Y if and only if the number j does not appear in
the i-th column of M. There should be no further adjacencies.

Thus, (G, X, Y) is a bipartite graph. Moreover, the graph G is (n − k)-regular
(this is not hard to see; a proof is given in footnote 107). Thus, by the Frobenius
matching theorem (Theorem 8.6.7), the graph G has a perfect matching. Let

{{1, −a_1}, {2, −a_2}, ..., {n, −a_n}}

be this perfect matching. Then, the numbers a_1, a_2, ..., a_n are distinct (since two
edges in a matching cannot have a common endpoint), and the number a_i does
not appear in the i-th column of M (since {i, −a_i} is an edge of G). Thus, we
can append the row

( a_1  a_2  ···  a_n )

to M at the bottom and obtain a (k + 1) × n Latin rectangle. This proves
Proposition 8.7.4. □


Proposition 8.7.4 is a result of Marshall Hall (no relation to Philip Hall) from
1945 (see [Hall45]), and the proof given above is exactly his.
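
The proof of Proposition 8.7.4 turns directly into a procedure for the row-filling step that was left unspecified above: match each column i to a number that does not yet occur in it. Here is a minimal sketch, assuming a Latin rectangle is stored as a list of rows; the name `extend_latin_rectangle` is hypothetical, and the perfect matching is found by the usual augmenting-path search (the proposition guarantees that one exists, but does not prescribe how to find it).

```python
def extend_latin_rectangle(rows, n):
    """Given a k x n Latin rectangle (a list of k rows with entries 1..n,
    where k < n), return one row that can be appended at the bottom."""
    # allowed[i] = numbers not appearing in column i
    # (these are the neighbors of the vertex i in the graph G of the proof)
    allowed = [set(range(1, n + 1)) - {row[i] for row in rows} for i in range(n)]
    column_of = {}  # number j -> column currently matched to it

    def augment(i, seen):
        # Look for an augmenting path starting at column i.
        for j in allowed[i]:
            if j in seen:
                continue
            seen.add(j)
            if j not in column_of or augment(column_of[j], seen):
                column_of[j] = i
                return True
        return False

    for i in range(n):
        if not augment(i, set()):
            raise ValueError("no extension found; is the input a k x n Latin rectangle with k < n?")

    new_row = [0] * n
    for j, i in column_of.items():
        new_row[i] = j
    return new_row


if __name__ == "__main__":
    rows = [[3, 1, 4, 2, 5], [2, 4, 1, 5, 3], [1, 5, 2, 3, 4]]
    while len(rows) < 5:
        rows.append(extend_latin_rectangle(rows, 5))
    for row in rows:
        print(row)
```

Starting from the empty rectangle and calling the function n times produces a Latin square of order n; which square comes out depends on the order in which the candidates are tried.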


8.8. Magic matrices and the Birkhoff–von Neumann theorem


Let us now apply the HMT to linear algebra. 
Recall that ℕ = {0, 1, 2, ...}. We also set ℝ₊ := {all nonnegative reals}.
Here are three very similar definitions: 


Definition 8.8.1. An ℕ-magic matrix means an n × n-matrix M that satisfies
the following three conditions:


1. All entries of M are nonnegative integers.


2. The sum of the entries in each row of M is equal.


3. The sum of the entries in each column of M is equal.


107 Proof. Each vertex i ∈ X has degree n − k (after all, there are k numbers in {1, 2, ..., n} that
appear in the i-th column of M, thus n − k numbers in {1, 2, ..., n} that do not appear in
this column). It remains to show that each vertex −j ∈ Y has degree n − k as well. To see
this, consider some vertex −j ∈ Y. Then, the number j appears exactly once in each row
of M (since Condition 2 forces each row to contain the numbers 1, 2, ..., n in some order).
Hence, the number j appears a total of k times in M. These k appearances of j must be
in k distinct columns (since having two of them in the same column would conflict with
Condition 3). Thus, there are k columns of M that contain j, and therefore n − k columns
that don’t. In other words, the vertex −j ∈ Y has degree n − k.


Definition 8.8.2. An ℝ₊-magic matrix means an n × n-matrix M that satisfies
the following three conditions:


1. All entries of M are nonnegative reals.


2. The sum of the entries in each row of M is equal.


3. The sum of the entries in each column of M is equal.


Definition 8.8.3. A doubly stochastic matrix means an n × n-matrix M that
satisfies the following three conditions:


1. All entries of M are nonnegative reals.


2. The sum of the entries in each row of M is 1.


3. The sum of the entries in each column of M is 1.


Clearly, these three concepts are closely related (in particular, all ℕ-magic
matrices and all doubly stochastic matrices are ℝ₊-magic). The most important
of them is the last; in particular, majorization theory (one of the main
methods for proving inequalities) is deeply connected to the properties of dou- 
bly stochastic matrices (see Chapter 2]). See Chapter 
2] for a chapter-length treatment of doubly stochastic matrices. We shall only 
prove some of their most basic properties. First, some examples: 


Example 8.8.4. For any n > 0, the n × n-matrix

\[
\begin{pmatrix}
1 & 1 & \cdots & 1 \\
1 & 1 & \cdots & 1 \\
\vdots & \vdots & \ddots & \vdots \\
1 & 1 & \cdots & 1
\end{pmatrix}
\]

is ℕ-magic and also ℝ₊-magic. This matrix is not doubly stochastic (unless
n = 1), since the sum of the entries in a row or column is n, not 1. However,
if we divide this matrix by n, it becomes doubly stochastic.


Example 8.8.5. Here is an ℕ-magic 3 × 3-matrix:

\[
\begin{pmatrix}
7 & 0 & 5 \\
2 & 6 & 4 \\
3 & 6 & 3
\end{pmatrix}
\]

Dividing this matrix by 12 gives a doubly stochastic matrix.


Example 8.8.6. A permutation matrix is an n × n-matrix whose entries are
0’s and 1’s, and which has exactly one 1 in each row and exactly one 1 in
each column. For example, [4 × 4 matrix; entries not recoverable] is a
permutation matrix of size 4.

For any n ∈ ℕ, there are n! many permutation matrices (of size n), since
they are in bijection with the permutations of {1, 2, ..., n}. Namely, if σ is
a permutation of {1, 2, ..., n}, then the corresponding permutation matrix
P(σ) has its (i, σ(i))-th entries equal to 1 for all i ∈ {1, 2, ..., n}, while its
remaining n² − n entries are 0. For example, if σ is the permutation of {1, 2, 3}
sending 1, 2, 3 to 2, 3, 1, then the corresponding permutation matrix P(σ) is

\[
\begin{pmatrix}
0 & 1 & 0 \\
0 & 0 & 1 \\
1 & 0 & 0
\end{pmatrix} .
\]

Any permutation matrix is ℕ-magic, ℝ₊-magic and doubly stochastic.


It turns out that these permutation matrices are (in a sense) the “building 
blocks” of all magic (and doubly stochastic) matrices! Namely, the following 
holds: 


Theorem 8.8.7 (Birkhoff–von Neumann theorem). Let n ∈ ℕ. Then:


(a) Any ℕ-magic n × n-matrix can be expressed as a finite sum of permutation
matrices.


(b) Any ℝ₊-magic n × n-matrix can be expressed as an ℝ₊-linear combination
of permutation matrices (i.e., it can be expressed in the form
λ_1 P_1 + λ_2 P_2 + ··· + λ_k P_k, where λ_1, λ_2, ..., λ_k ∈ ℝ₊ are numbers and
where P_1, P_2, ..., P_k are permutation matrices).


(c) Let n > 0. Any doubly stochastic n × n-matrix can be expressed as a
convex combination of permutation matrices (i.e., it can be expressed
in the form λ_1 P_1 + λ_2 P_2 + ··· + λ_k P_k, where λ_1, λ_2, ..., λ_k ∈ ℝ₊ are
numbers satisfying λ_1 + λ_2 + ··· + λ_k = 1 and where P_1, P_2, ..., P_k are
permutation matrices).




Soon we will sketch a proof of this theorem using the HMT. First, two simple 
results that will be used in the proof. 


Proposition 8.8.8. Let A be an ℕ-magic or ℝ₊-magic n × n-matrix. Then,
the sum of all entries in a row of A equals the sum of all entries in a column
of A.


Proof. Both sums equal 1/n times the sum of all entries of A (since A has n rows
and n columns, and since all row sums are equal and all column sums are equal). □


Lemma 8.8.9. Let M be an ℕ-magic or ℝ₊-magic matrix that is not the zero
matrix. Then, there exists a permutation σ of {1, 2, ..., n} such that all entries
M_{1,σ(1)}, M_{2,σ(2)}, ..., M_{n,σ(n)} are nonzero.


Example 8.8.10. If n = 3 and
$M = \begin{pmatrix} 2 & 7 & 1 \\ 0 & 1 & 9 \\ 8 & 2 & 0 \end{pmatrix}$,
then the permutation σ that sends 1, 2, 3 to 3, 2, 1 has this property.
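
For small n, a permutation as in Lemma 8.8.9 can be found by simply trying all n! permutations. The following is a minimal brute-force sketch (the function name `nonzero_permutation` is made up for this illustration); it uses 0-based indices, so the tuple (2, 1, 0) stands for the permutation that sends 1, 2, 3 to 3, 2, 1.

```python
from itertools import permutations


def nonzero_permutation(M):
    """Return a tuple sigma (0-based) such that M[i][sigma[i]] != 0 for all i,
    or None if no such permutation exists. Tries all n! permutations."""
    n = len(M)
    for sigma in permutations(range(n)):
        if all(M[i][sigma[i]] != 0 for i in range(n)):
            return sigma
    return None


if __name__ == "__main__":
    M = [[2, 7, 1], [0, 1, 9], [8, 2, 0]]
    print(nonzero_permutation(M))
```

The permutation in Example 8.8.10 is not the only one with the required property; this search returns the lexicographically first one, which for the matrix above is (0, 2, 1), i.e., the permutation sending 1, 2, 3 to 1, 3, 2.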


Proof of Lemma 8.8.9. Let s denote the sum of the entries in any given row of M
(it doesn’t matter which row we take, since M is magic). Then, s is also the sum
of the entries in any given column of M (by Proposition 8.8.8). Also, the sum
of all entries of M is ns. Hence, ns > 0 (since M has nonnegative entries and is
not the zero matrix). Thus, s > 0.

Let X = {1, 2, ..., n} and Y = {−1, −2, ..., −n}.

Let G be the simple graph with vertex set X ∪ Y and with edges defined as
follows: A vertex i ∈ X shall be adjacent to a vertex −j ∈ Y if and only if
M_{i,j} > 0 (here, M_{i,j} denotes the (i, j)-th entry of M). There shall be no further
adjacencies.

Thus, (G, X, Y) is a bipartite graph.

We shall now prove that it satisfies the Hall condition. That is, we shall prove
that every subset A of {1, 2, ..., n} satisfies |N(A)| ≥ |A|.

Assume the contrary. Thus, there exists a subset A of {1, 2, ..., n} that satisfies
|N(A)| < |A|. Consider this A. WLOG assume that A = {1, 2, ..., k} for
some k ∈ {0, 1, ..., n} (otherwise, we permute the rows of M). Thus, all positive
entries in the first k rows of M are concentrated in fewer than k columns (since
the columns in which they lie are the j-th columns for −j ∈ N(A), but we have
|N(A)| < |A| = k). Therefore, the sum of these entries is at most |N(A)| · s, which
is smaller than ks (since the sum of all entries in any given column is s, and since
s > 0). On the other hand, however, the sum of these entries equals ks, because
they are all the positive entries in the first k rows of M (and the sum of all positive
entries in a given row equals the sum of all entries in this row, which is s). The two
preceding sentences clearly contradict each other. This contradiction shows that our
assumption was false.

Hence, the Hall condition is satisfied. Thus, the HMT yields that G has an
X-complete matching, and this matching is perfect (since |X| = |Y| = n). Let

{{1, −a_1}, {2, −a_2}, ..., {n, −a_n}}

be this perfect matching. Then, a_1, a_2, ..., a_n are distinct, so we can find a
permutation σ of {1, 2, ..., n} such that a_i = σ(i) for all i ∈ {1, 2, ..., n}. This
permutation σ then satisfies M_{i,σ(i)} > 0 for all i ∈ {1, 2, ..., n}, which is what
we wanted. Thus, Lemma 8.8.9 is proved. □


Proof of Theorem 8.8.7 (sketched). (a) Let M be an ℕ-magic n × n-matrix. How
can we express M as a sum of permutation matrices?

We can try the following method: Try to subtract a permutation matrix from
M in such a way that the result will still be an ℕ-magic matrix. Then do this
again, and again and again... until we reach the zero matrix. Once we have
arrived at the zero matrix, the sum of all the permutation matrices that we have
subtracted along the way must be M.

Let us experience this method on an example: Let n = 3 and (see footnote 108)

\[
M = \begin{pmatrix} 2 & 7 & 1 \\ & 1 & 9 \\ 8 & 2 & \end{pmatrix} .
\]

If we subtract a permutation matrix from M, then the resulting
matrix will still satisfy Conditions 2 and 3 of Definition 8.8.1 (since the sum of
the entries in any row has been decreased by 1, and the sum of the entries in any
column has also been decreased by 1); however, Condition 1 is not guaranteed,
since the subtraction may turn an entry of M negative (which is not allowed).
For example, this would happen if we tried to subtract a permutation matrix
that has a 1 in a position where M has a 0. Fortunately, Lemma 8.8.9 tells us that
there is a permutation σ of {1, 2, ..., n} such that all entries
M_{1,σ(1)}, M_{2,σ(2)}, ..., M_{n,σ(n)} are nonzero. If we choose such a σ, and subtract
the corresponding permutation matrix P(σ) from M, then we obtain an ℕ-magic
matrix, because subtracting 1 from the nonzero entries
M_{1,σ(1)}, M_{2,σ(2)}, ..., M_{n,σ(n)} cannot render any of these entries negative.
In our example, we can pick σ to be the permutation that sends 1, 2, 3 to 3, 2, 1.
The corresponding permutation matrix P(σ) is

\[
\begin{pmatrix} & & 1 \\ & 1 & \\ 1 & & \end{pmatrix} .
\]

Subtracting this matrix from M, we find

\[
\begin{pmatrix} 2 & 7 & 1 \\ & 1 & 9 \\ 8 & 2 & \end{pmatrix}
- \begin{pmatrix} & & 1 \\ & 1 & \\ 1 & & \end{pmatrix}
= \begin{pmatrix} 2 & 7 & \\ & & 9 \\ 7 & 2 & \end{pmatrix} .
\]


108 We are here omitting zero entries from matrices. Thus,
$\begin{pmatrix} 2 & 7 & 1 \\ & 1 & 9 \\ 8 & 2 & \end{pmatrix}$ means the matrix
$\begin{pmatrix} 2 & 7 & 1 \\ 0 & 1 & 9 \\ 8 & 2 & 0 \end{pmatrix}$.


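This is again an ℕ-magic matrix. Thus, let us do the same to it that we did to
M: We again subtract a permutation matrix.

This time, we can actually do better: We can subtract the permutation matrix
$\begin{pmatrix} & 1 & \\ & & 1 \\ 1 & & \end{pmatrix}$ from
$\begin{pmatrix} 2 & 7 & \\ & & 9 \\ 7 & 2 & \end{pmatrix}$ not just once, but 7 times,
without rendering any entry negative, because the relevant entries 7, 9, 7 are
all ≥ 7. The result is

\[
\begin{pmatrix} 2 & 7 & \\ & & 9 \\ 7 & 2 & \end{pmatrix}
- 7 \cdot \begin{pmatrix} & 1 & \\ & & 1 \\ 1 & & \end{pmatrix}
= \begin{pmatrix} 2 & & \\ & & 2 \\ & 2 & \end{pmatrix} .
\]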

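Now, we follow the same recipe and again subtract a permutation matrix.
This time, we can do it 2 times, and obtain

\[
\begin{pmatrix} 2 & & \\ & & 2 \\ & 2 & \end{pmatrix}
- 2 \cdot \begin{pmatrix} 1 & & \\ & & 1 \\ & 1 & \end{pmatrix}
= 0_{3 \times 3}
\]

(the zero matrix, in case you’re wondering).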

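Thus, we have arrived at the zero matrix by successively subtracting permutation
matrices from M. Hence, M is the sum of all the permutation matrices
that have been subtracted: namely,

\[
M = \begin{pmatrix} & & 1 \\ & 1 & \\ 1 & & \end{pmatrix}
  + 7 \cdot \begin{pmatrix} & 1 & \\ & & 1 \\ 1 & & \end{pmatrix}
  + 2 \cdot \begin{pmatrix} 1 & & \\ & & 1 \\ & 1 & \end{pmatrix} ,
\]

which is a sum of 1 + 7 + 2 permutation matrices.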
This method works in general, because: 


• If M is an ℕ-magic matrix that is not the zero matrix, then Lemma 8.8.9
tells us that there is a permutation σ of {1, 2, ..., n} such that all entries
M_{1,σ(1)}, M_{2,σ(2)}, ..., M_{n,σ(n)} are nonzero. We can then choose such a σ
and subtract the corresponding permutation matrix P(σ) from M.


• Better yet, we can subtract m · P(σ) from M, where

m = min {M_{1,σ(1)}, M_{2,σ(2)}, ..., M_{n,σ(n)}}.

This results in an ℕ-magic matrix (since the sum of the entries decreases
by m in each row and by m in each column, and since we are only subtracting
m from a bunch of entries that are ≥ m) that has at least one
fewer nonzero entry than M (since at least one of the nonzero entries
M_{1,σ(1)}, M_{2,σ(2)}, ..., M_{n,σ(n)} becomes 0 when m is subtracted from it).


• This way, in each step of our process, the number of nonzero entries of
our matrix decreases by at least 1 (but the matrix remains an ℕ-magic
matrix throughout the process). Hence, we eventually (after at most n²
steps) will end up with the zero matrix.


This proves Theorem 8.8.7 (a).


(b) This is analogous to the proof of part (a) (but this time, we have to subtract
m · P(σ) rather than P(σ) in our procedure, since the nonzero entries
M_{1,σ(1)}, M_{2,σ(2)}, ..., M_{n,σ(n)} are not necessarily ≥ 1).


(c) Let M be a doubly stochastic n × n-matrix. Then, M is also ℝ₊-magic.
Hence, part (b) shows that M can be expressed in the form λ_1 P_1 + λ_2 P_2 +
··· + λ_k P_k, where λ_1, λ_2, ..., λ_k ∈ ℝ₊ are numbers and where P_1, P_2, ..., P_k are
permutation matrices. Consider these λ_1, λ_2, ..., λ_k and these P_1, P_2, ..., P_k.

Now, consider the sum of all entries in the first row of M. It is easy to see that
this sum is λ_1 + λ_2 + ··· + λ_k (because M = λ_1 P_1 + λ_2 P_2 + ··· + λ_k P_k, and each
permutation matrix P_i contributes a 1 to the sum of all entries in the first row).
But we know that this sum is 1, since M is doubly stochastic. Comparing these,
we conclude that λ_1 + λ_2 + ··· + λ_k = 1. Thus, we have expressed M in the form
λ_1 P_1 + λ_2 P_2 + ··· + λ_k P_k, where λ_1, λ_2, ..., λ_k ∈ ℝ₊ are numbers satisfying
λ_1 + λ_2 + ··· + λ_k = 1 and where P_1, P_2, ..., P_k are permutation matrices. This
proves Theorem 8.8.7 (c). □
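
The subtraction procedure from the proof of part (a) can be written out as follows. This is a minimal sketch under stated assumptions: the matrix is a list of rows with nonnegative integer entries, permutations are stored as 0-based tuples, and the name `birkhoff_decompose` is made up for this illustration. The permutation promised by Lemma 8.8.9 is found by an augmenting-path search in the bipartite graph from the proof of that lemma.

```python
def birkhoff_decompose(M):
    """Decompose an N-magic matrix M into a list of pairs (m, sigma), meaning
    M = sum of m * P(sigma), where sigma is a 0-based permutation tuple."""
    n = len(M)
    M = [list(row) for row in M]  # work on a copy; the input is left untouched
    terms = []
    while any(any(row) for row in M):  # loop until M is the zero matrix
        match = {}  # column j -> row i currently matched to it

        def augment(i, seen):
            # Row i is adjacent to column j if and only if M[i][j] > 0.
            for j in range(n):
                if M[i][j] > 0 and j not in seen:
                    seen.add(j)
                    if j not in match or augment(match[j], seen):
                        match[j] = i
                        return True
            return False

        for i in range(n):
            if not augment(i, set()):
                raise ValueError("no suitable permutation; the matrix is not magic")

        sigma = [0] * n
        for j, i in match.items():
            sigma[i] = j
        m = min(M[i][sigma[i]] for i in range(n))  # largest multiple we may subtract
        for i in range(n):
            M[i][sigma[i]] -= m
        terms.append((m, tuple(sigma)))
    return terms


if __name__ == "__main__":
    for m, sigma in birkhoff_decompose([[2, 7, 1], [0, 1, 9], [8, 2, 0]]):
        print(m, sigma)
```

The permutations found need not be the ones chosen in the worked example above, since the decomposition is not unique; however, the coefficients always add up to the common row sum (here 10). With floating-point entries, the same loop would mirror part (b), up to rounding errors.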


8.9. Further uses of Hall’s marriage theorem 


The following few exercises illustrate other applications of Hall’s marriage the- 
orem: 


Exercise 8.3. Let X and Y be two finite sets such that |X| ≤ |Y|. Let f : X → Y
be a map that is not constant. (A map is said to be constant if all its values
are equal.) Prove that there exists an injective map g : X → Y such that each
x ∈ X satisfies g(x) ≠ f(x).


Exercise 8.4. Let A and B be two finite sets such that |B| ≥ |A|. Let d_{i,j} be a
real number for each (i, j) ∈ A × B. Let

\[
m_1 = \min_{\sigma : A \to B \text{ injective}} \; \max_{i \in A} \; d_{i, \sigma(i)}
\]

and

\[
m_2 = \max_{\substack{I \subseteq A;\ J \subseteq B; \\ |I| + |J| = |B| + 1}} \; \min_{(i,j) \in I \times J} \; d_{i,j} .
\]

(The notation “min_{some kind of objects} some kind of value” means the minimum
of the given value over all objects of the given kind. An analogous notation
is used for a maximum.) Prove that m_1 = m_2.


Exercise 8.5. Let G = (V, E) be a simple graph such that |E| ≥ |V|. Show
that there exists an injective map f : V → E such that for each vertex v ∈ V,
the edge f(v) does not contain v.

(In other words, show that we can assign to each vertex an edge that does
not contain this vertex in such a way that no edge is assigned twice.)


[Remark: This is, in some sense, an “evil twin” to Exercise However, 
it requires a simple graph, not a multigraph, since a multigraph with a single
vertex and a single loop would constitute a counterexample. Incidentally,
Exercise 5.17 can also be solved using Hall’s marriage theorem.]


[Solution: This is Exercise 1 on midterm #2 from my Spring 2017 course;
see the course page for solutions.]


Exercise 8.6. Let S be a finite set, and let k ∈ ℕ. Let A_1, A_2, ..., A_k be k
subsets of S such that each element of S lies in exactly one of these k subsets.
Prove that the following statements are equivalent:


• Statement 1: There exists a bijection σ : S → S such that each i ∈
{1, 2, ..., k} satisfies σ(A_i) ∩ A_i = ∅.


• Statement 2: Each i ∈ {1, 2, ..., k} satisfies |A_i| ≤ |S| / 2.


[Solution: This is Exercise 5 on homework set #4 from my Spring 2017
course; see the course page for solutions.]


Exercise 8.7. Let S be a finite set. Let k ∈ ℕ be such that |S| ≥ 2k + 1.
Prove that there exists an injective map f : P_k(S) → P_{k+1}(S) such that each
X ∈ P_k(S) satisfies f(X) ⊇ X.

(In other words, prove that we can add to each k-element subset X of S an
additional element from S \ X such that the resulting (k + 1)-element subsets
are distinct.)


[Example: For S = {1, 2, 3, 4, 5} and k = 2, we can (for instance) have the
map f send

{1,2} ↦ {1,2,4},   {1,3} ↦ {1,3,4},   {1,4} ↦ {1,4,5},
{1,5} ↦ {1,3,5},   {2,3} ↦ {1,2,3},   {2,4} ↦ {2,4,5},
{2,5} ↦ {1,2,5},   {3,4} ↦ {2,3,4},   {3,5} ↦ {2,3,5},
{4,5} ↦ {3,4,5}.

Do you see any pattern behind these values? (I don’t).]

[Hint: First, reduce the problem to the case when |S| = 2k + 1. Then, in
that case, restate it as a claim about matchings in a certain bipartite graph.]


[Solution: This is Exercise 4 on homework set #4 from my Spring 2017
course; see the course page for solutions.]


Exercise 8.8. Let (G, X, Y) be a bipartite graph. Assume that each subset
A of X satisfies |N(A)| ≥ |A|. (Thus, Theorem 8.3.4 shows that G has an
X-complete matching.)

A subset A of X will be called neighbor-critical if |N(A)| = |A|.

Let A and B be two neighbor-critical subsets of X. Prove that the subsets
A ∪ B and A ∩ B are also neighbor-critical.


[Solution: This is Exercise 6 on homework set #4 from my Spring 2017
course; see the course page for solutions.]


8.10. Further exercises on matchings 


Exercise 8.9. Let G = (V, E, φ) be a multigraph. Let M be a matching of G.
An augmenting path for M shall mean a path (v_0, e_1, v_1, e_2, v_2, ..., e_k, v_k) of
G such that k is odd (note that k = 1 is allowed) and such that


• the even-indexed edges e_2, e_4, ..., e_{k−1} belong to M (note that this
condition is vacuously true if k = 1);


• the odd-indexed edges e_1, e_3, ..., e_k belong to E \ M;


• neither the starting point v_0 nor the ending point v_k is matched in M.


Prove that M has maximum size among all matchings of G if and only if
there exists no augmenting path for M.


[Hint: If M and M′ are two matchings of G, what can you say about the
symmetric difference (M ∪ M′) \ (M ∩ M′)?]


Exercise 8.10. Let (G, X, Y) be a bipartite graph. Let A be a subset of X, and
let B be a subset of Y. Assume that G has an A-complete matching, and that
G has a B-complete matching. Prove that G has an (A ∪ B)-complete matching.


Exercise 8.11. Let $(G, X, Y)$ be a bipartite graph with $X \neq \varnothing$. Assume that $G$
has an $X$-complete matching.

An edge $e$ of $G$ will be called useless if $G$ has no $X$-complete matching
that contains $e$.

Prove that there exists a vertex $x \in X$ such that no edge that contains $x$ is
useless.
Exercise 8.12. Let $(G, X, Y)$ be a bipartite graph such that $|Y| \geq 2|X| - 1$.
Prove that there exists an injective map $f : X \to Y$ such that each $x \in X$
satisfies one of the following two statements:


• Statement 1: The vertices $x$ and $f(x)$ of $G$ are adjacent.


• Statement 2: There exists no $x' \in X$ such that the vertices $x$ and $f(x')$ of
$G$ are adjacent.


[Remark: In vaguely matrimonial terminology, this is saying that in a
group of $m$ men and $w$ women satisfying $w \geq 2m - 1$, we can always marry
each man (monogamously) to a woman in such a way that either he likes his
partner or all women he likes are unmarried.]


[Solution: This is Exercise 3 on midterm #3 from my Spring 2017 course;
see the course page for solutions.]


Exercise 8.13. Let (G, X,Y) and (H,U, V) be bipartite graphs. 
Assume that G is a simple graph and has an X-complete matching. 
Assume that H is a simple graph and has a U-complete matching. 
Consider the Cartesian product $G \times H$ of $G$ and $H$, as defined elsewhere in these notes.
(Note that we required $G$ and $H$ to be simple graphs only to avoid
having to define $G \times H$ for multigraphs.)


(a) Show that $(G \times H, \ (X \times V) \cup (Y \times U), \ (X \times U) \cup (Y \times V))$ is a
bipartite graph.


(b) Prove that the graph $G \times H$ has an $\left((X \times V) \cup (Y \times U)\right)$-complete
matching.


[Solution: This is Exercise 3 on homework set #4 from my Spring 2017
course; see the course page for solutions.]


9. Networks and flows 


In this chapter, I will give an introduction to network flows and their optimiza- 
tion. This is a topic of great interest to logisticians, as even the simplest results 
have obvious applications to scheduling trains and trucks. It also has lots of 
purely mathematical consequences; in particular, we will use network flows to 
finally prove the Hall–König matching theorem (and thus the HMT, König’s
theorem, and their many consequences). 

I will follow my notes [17s-lec16], which are a good place to look up the
details of some proofs that I will only sketch. That said, I will be using
multidigraphs instead of simple digraphs, so some adaptations will be necessary
(since [17s-lec16] only works with simple digraphs). These adaptations are
generally easy.

I will only cover the very basics of network flow optimization, leading to a
proof of the max-flow-min-cut theorem (for integer-valued flows) and to a proof
of the Hall–König matching theorem. For the deeper reaches of the theory,
see [ForFul74] (a classical textbook written by the inventors of the subject),
[Schrij17, Chapter 4] and [Schrij03, Part I].


9.1. Definitions


9.1.1. Networks


Recall that we use the notation $\mathbb{N} = \{0, 1, 2, \ldots\}$.


Definition 9.1.1. A network consists of

• a multidigraph $D = (V, A, \psi)$;


• two distinct vertices $s \in V$ and $t \in V$, called the source and the sink,
respectively;


• a function $c : A \to \mathbb{N}$, called the capacity function.
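For readers who like to experiment, the following is a minimal Python sketch of how a network in the sense of Definition 9.1.1 might be stored. The class name `Network`, its field names and the sample arcs are assumptions of this sketch, not notation from these notes. Arcs carry names, so that parallel arcs are possible, as in a multidigraph.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Network:
    """A network: multidigraph (V, A, psi), source s, sink t, capacities c.

    All names here are illustrative choices, not notation from the text.
    """
    vertices: frozenset   # the vertex set V
    arcs: dict            # psi: arc name -> (source, target)
    source: str           # the source s
    sink: str             # the sink t
    capacity: dict        # c: arc name -> nonnegative integer

    def __post_init__(self):
        # The source and the sink must be two distinct vertices.
        if self.source == self.sink:
            raise ValueError("source and sink must be distinct")
        if not {self.source, self.sink} <= self.vertices:
            raise ValueError("source and sink must be vertices")
        # Each arc joins two vertices and has a capacity in N.
        for a, (u, v) in self.arcs.items():
            if u not in self.vertices or v not in self.vertices:
                raise ValueError(f"arc {a!r} has an endpoint outside V")
            c = self.capacity.get(a)
            if not isinstance(c, int) or c < 0:
                raise ValueError(f"arc {a!r} needs a capacity in N")


# A small hypothetical network (not the one drawn in Example 9.1.2).
example = Network(
    vertices=frozenset({"s", "p", "q", "t"}),
    arcs={"sp": ("s", "p"), "sq": ("s", "q"), "pq": ("p", "q"),
          "pt": ("p", "t"), "qt": ("q", "t")},
    source="s",
    sink="t",
    capacity={"sp": 3, "sq": 2, "pq": 1, "pt": 2, "qt": 3},
)
```

Storing $\psi$ as a dictionary from arc names to pairs (rather than storing arcs as bare pairs) is a design choice that keeps the sketch valid for multidigraphs.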


Example 9.1.2. Here is an example of a network: 


Here, the multidigraph D is the one we drew (it is a simple digraph, so we 
have not labeled its arcs); the vertices s and t are the vertices labeled s and 
t; the values of the function c on the arcs of D are written on top of these 
respective arcs (e.g., we have c ((s,p)) = 3 and c((u,q)) = 1). 


Remark 9.1.3. The digraph $D$ in Example 9.1.2 has no cycles and satisfies
$\deg^- s = \deg^+ t = 0$. This is not required in the definition of a network,
although it is satisfied in many basic applications.

Also, all capacities $c(a)$ in Example 9.1.2 were positive. This, too, is not
required; however, arcs with capacity 0 do not contribute anything useful to
the situation, so they could just as well be absent.
Remark 9.1.4. The notion of “network” we just defined is just one of a myriad
notions of “network” that can be found all over mathematics. Most of them 
can be regarded as graphs with “some extra structures”; apart from this, they 
don’t have much in common. 


9.1.2. The notations $\overline{S}$, $[P, Q]$ and $d(P, Q)$


Definition 9.1.5. Let $N$ be a network consisting of a multidigraph $D =
(V, A, \psi)$, a source $s \in V$, a sink $t \in V$ and a capacity function $c : A \to \mathbb{N}$.
Then:


(a) For any arc $a \in A$, we call the number $c(a) \in \mathbb{N}$ the capacity of the arc
$a$.


(b) For any subset $S$ of $V$, we let $\overline{S}$ denote the subset $V \setminus S$ of $V$.


(c) If $P$ and $Q$ are two subsets of $V$, then $[P, Q]$ shall mean the set of all
arcs of $D$ whose source belongs to $P$ and whose target belongs to $Q$.
That is,

\[
[P, Q] := \{ a \in A \mid \psi(a) \in P \times Q \}.
\]


(d) If $P$ and $Q$ are two subsets of $V$, and if $d : A \to \mathbb{N}$ is any function, then
the number $d(P, Q) \in \mathbb{N}$ is defined by

\[
d(P, Q) := \sum_{a \in [P, Q]} d(a).
\]

(In particular, we can apply this to $d = c$, and then get $c(P, Q) =
\sum_{a \in [P, Q]} c(a)$.)

Example 9.1.6. Let us again consider the network from Example 9.1.2. For
the subset $\{s, u\}$ of $V$, we have $\overline{\{s, u\}} = \{p, v, q, t\}$ and

\[
\left[ \{s, u\}, \overline{\{s, u\}} \right] = \{sp, uv, uq\}
\]

(recall that our $D$ is a simple digraph, so an arc is just a pair of two vertices)
and

\[
c\left( \{s, u\}, \overline{\{s, u\}} \right)
= \sum_{a \in \left[ \{s, u\}, \overline{\{s, u\}} \right]} c(a)
= c(sp) + c(uv) + c(uq)
= 3 + 1 + 1 = 5.
\]

We can make this visually clearer if we draw a “border” between the sets
$\{s, u\}$ and $\overline{\{s, u\}}$:


Then, $\left[ \{s, u\}, \overline{\{s, u\}} \right]$ is the set of all arcs that cross this border from $\{s, u\}$ to
$\overline{\{s, u\}}$. (Of course, this visualization works only for sums of the form $d(P, \overline{P})$,
not for the more general case of $d(P, Q)$ where $P$ and $Q$ can have elements
in common. But the $d(P, \overline{P})$ are the most useful sums.)
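The notations $[P, Q]$ and $d(P, Q)$ translate directly into code. The sketch below uses a small hypothetical network (not the one from the example above, whose drawing is not reproduced here); the function names `arcs_between` and `total` are assumptions of this sketch.

```python
# psi: arc name -> (source, target); a hypothetical network for illustration.
arcs = {"sp": ("s", "p"), "sq": ("s", "q"), "pq": ("p", "q"),
        "pt": ("p", "t"), "qt": ("q", "t")}
cap = {"sp": 3, "sq": 2, "pq": 1, "pt": 2, "qt": 3}
V = {"s", "p", "q", "t"}


def arcs_between(arcs, P, Q):
    """Return [P, Q]: the arcs with source in P and target in Q."""
    return {a for a, (u, v) in arcs.items() if u in P and v in Q}


def total(d, arcs, P, Q):
    """Return d(P, Q): the sum of d(a) over all arcs a in [P, Q]."""
    return sum(d[a] for a in arcs_between(arcs, P, Q))


S = {"s", "p"}
S_bar = V - S                                  # the complement of S in V
print(sorted(arcs_between(arcs, S, S_bar)))    # arcs crossing from S to S_bar
print(total(cap, arcs, S, S_bar))              # c(S, S_bar)
print(total(cap, arcs, S_bar, S))              # c(S_bar, S)
```

Here the arcs crossing the “border” from `S` to `S_bar` are `sq`, `pq` and `pt`, so that `total(cap, arcs, S, S_bar)` is $2 + 1 + 2 = 5$, while no arc crosses in the opposite direction.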


Exercise 9.1. Let $D = (V, A, \psi)$ be a balanced multidigraph. For any subset
$S$ of $V$, we set $\overline{S} := V \setminus S$. For any two subsets $S$ and $T$ of $V$, we set

\[
[S, T] := \{ a \in A \mid \text{the source of } a \text{ belongs to } S,
\text{ and the target of } a \text{ belongs to } T \}.
\]

Prove that $\left| [S, \overline{S}] \right| = \left| [\overline{S}, S] \right|$ for any subset $S$ of $V$.


9.1.3. Flows


Let us now define flows on a network:


Definition 9.1.7. Let $N$ be a network consisting of a multidigraph $D =
(V, A, \psi)$, a source $s \in V$, a sink $t \in V$ and a capacity function $c : A \to \mathbb{N}$.
A flow (on the network $N$) means a function $f : A \to \mathbb{N}$ with the following
properties:

• We have $0 \leq f(a) \leq c(a)$ for each arc $a \in A$. This condition is called
the capacity constraints (we are using the plural form, since there is
one constraint for each arc $a \in A$).


• For any vertex $v \in V \setminus \{s, t\}$, we have

\[
f^-(v) = f^+(v),
\]

where we set

\[
f^-(v) := \sum_{\substack{a \in A \text{ is an arc} \\ \text{with target } v}} f(a)
\qquad \text{and} \qquad
f^+(v) := \sum_{\substack{a \in A \text{ is an arc} \\ \text{with source } v}} f(a).
\]

This is called the conservation constraints.


If $f : A \to \mathbb{N}$ is a flow and $a \in A$ is an arc, then the nonnegative integer
$f(a)$ will be called the arc flow of $f$ on $a$.
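The two kinds of constraints in Definition 9.1.7 are easy to test mechanically. The following Python sketch does this on a small hypothetical network; the names `inflow`, `outflow` and `is_flow` and the sample data are assumptions of this sketch.

```python
# Hypothetical network used only for illustration: psi maps arc names to
# (source, target) pairs, and cap gives the capacities.
arcs = {"sp": ("s", "p"), "sq": ("s", "q"), "pq": ("p", "q"),
        "pt": ("p", "t"), "qt": ("q", "t")}
cap = {"sp": 3, "sq": 2, "pq": 1, "pt": 2, "qt": 3}


def inflow(f, v):
    """f^-(v): the sum of f(a) over all arcs a with target v."""
    return sum(f[a] for a, (_, target) in arcs.items() if target == v)


def outflow(f, v):
    """f^+(v): the sum of f(a) over all arcs a with source v."""
    return sum(f[a] for a, (source, _) in arcs.items() if source == v)


def is_flow(f, s="s", t="t"):
    """Check the capacity constraints and the conservation constraints."""
    capacity_ok = all(
        isinstance(f[a], int) and 0 <= f[a] <= cap[a] for a in arcs
    )
    vertices = {x for endpoints in arcs.values() for x in endpoints}
    conservation_ok = all(
        inflow(f, v) == outflow(f, v) for v in vertices - {s, t}
    )
    return capacity_ok and conservation_ok


f = {"sp": 2, "sq": 1, "pq": 1, "pt": 1, "qt": 2}
print(is_flow(f))                  # all constraints hold
print(is_flow(dict(f, sp=4)))      # exceeds the capacity of the arc sp
print(is_flow(dict(f, pt=0)))      # inflow and outflow differ at p
```

Note that the conservation constraints are checked only at the vertices other than `s` and `t`, exactly as in the definition.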


Example 9.1.8. To draw a flow $f$ on a network $N$, we draw the network
$N$, with one little tweak: Instead of writing the capacity $c(a)$ atop each arc
$a \in A$, we write “$f(a)$ of $c(a)$” atop each arc $a \in A$. For example, here is a
flow $f$ on the network $N$ from Example 9.1.2:


(so, for example, $f(su) = 2$, $f(pq) = 1$ and $f(qv) = 0$).
For another example, here is a different flow $g$ on the same network $N$:


There are several intuitive ways to think of a network N and of a flow on it: 


• We can visualize $N$ as a collection of one-way roads: Each arc $a \in A$ is a
one-way road, and its capacity $c(a)$ is how much traffic it can (maximally)
handle per hour. A flow $f$ on $N$ can then be understood as traffic flowing
through these roads, where $f(a)$ is the amount of traffic that travels
through the arc $a$ in an hour. The conservation constraints say that the
traffic out of a given vertex $v$ equals the traffic into $v$ unless $v$ is one of $s$
and $t$. (We imagine that traffic can arbitrarily materialize or dematerialize
at $s$ and $t$.)


• We can visualize $N$ as a collection of pipes: Each arc $a \in A$ is a pipe,
and its capacity $c(a)$ is how much water it can maximally transport in a
second. A flow $f$ on $N$ can then be viewed as water flowing through the
pipes, where $f(a)$ is the amount of water traveling through a pipe $a$ in a
second. The capacity constraints say that no pipe is over its capacity or
carries a negative amount of water. The conservation constraints say that
at every vertex $v$ other than $s$ and $t$, the amount of water coming in (that
is, $f^-(v)$) equals the amount of water moving out (that is, $f^+(v)$); that
is, there are no leaks and no water being injected into the system other
than at $s$ and $t$. This is why $s$ is called the “source” and $t$ is the “sink”. A
slightly counterintuitive aspect of this visualization is that each pipe has
a direction, and water can only flow in that one direction (from source to
target). That said, you can always model an undirected pipe by having
two pipes of opposite directions.


• We can regard $N$ as a money transfer scheme: Each vertex $v \in V$ is a bank
account, and the goal is to transfer some money from $s$ to $t$. All other
vertices $v$ act as middlemen. Each arc $a \in A$ corresponds to a possibility
of transfer from its source to its target; the maximum amount that can be
transferred on this arc is $c(a)$. A flow describes a way in which money
is transferred such that each middleman vertex $v \in V \setminus \{s, t\}$ ends up
receiving exactly as much money as it gives away.


Needless to say, these visualizations have been chosen for their intuitive 
grasp; the real-life applications of network flows are somewhat different. 


Remark 9.1.9. Flows on a network $N$ can be viewed as a generalization of
paths on the underlying digraph $D$. Indeed, if $p$ is a path from $s$ to $t$ on the
digraph $D = (V, A, \psi)$ underlying a network $N$, then we can define a flow
$f_p$ on $N$ as follows:

\[
f_p(a) =
\begin{cases}
1, & \text{if } a \text{ is an arc of } p; \\
0, & \text{otherwise}
\end{cases}
\qquad \text{for each } a \in A,
\]

provided that all arcs of $p$ have capacity $\geq 1$. An example of such a flow is
the flow $g$ in Example 9.1.8.


9.1.4. Inflow, outflow and value of a flow


Next, we define certain numbers related to any flow on a network:


Definition 9.1.10. Let $N$ be a network consisting of a multidigraph $D =
(V, A, \psi)$, a source $s \in V$, a sink $t \in V$ and a capacity function $c : A \to \mathbb{N}$.
Let $f : A \to \mathbb{N}$ be an arbitrary map (e.g., a flow on $N$). Then:


(a) For each vertex $v \in V$, we set

\[
f^-(v) := \sum_{\substack{a \in A \text{ is an arc} \\ \text{with target } v}} f(a)
\qquad \text{and} \qquad
f^+(v) := \sum_{\substack{a \in A \text{ is an arc} \\ \text{with source } v}} f(a).
\]

We call $f^-(v)$ the inflow of $f$ into $v$, and we call $f^+(v)$ the outflow of
$f$ from $v$.


(b) We define the value of the map $f$ to be the number $f^+(s) - f^-(s)$. This
value is denoted by $|f|$.
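As a quick illustration of Definition 9.1.10, here is a Python sketch that computes inflows, outflows and the value of a map on a small hypothetical network. The function names and the sample map `f` are assumptions of this sketch.

```python
# Hypothetical network and map, for illustration only.
arcs = {"sp": ("s", "p"), "sq": ("s", "q"), "pq": ("p", "q"),
        "pt": ("p", "t"), "qt": ("q", "t")}
f = {"sp": 2, "sq": 1, "pq": 1, "pt": 1, "qt": 2}


def inflow(f, v):
    """f^-(v): the total of f over the arcs with target v."""
    return sum(f[a] for a, (_, target) in arcs.items() if target == v)


def outflow(f, v):
    """f^+(v): the total of f over the arcs with source v."""
    return sum(f[a] for a, (source, _) in arcs.items() if source == v)


def value(f, s="s"):
    """|f| = f^+(s) - f^-(s)."""
    return outflow(f, s) - inflow(f, s)


for v in ["s", "p", "q", "t"]:
    print(v, "inflow:", inflow(f, v), "outflow:", outflow(f, v))
print("value:", value(f))
```

For this particular `f`, the outflow from `s` is $2 + 1 = 3$ and the inflow into `s` is $0$, so the value is $3$.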


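Example 9.1.11. The flow $f$ in Example 9.1.8 satisfies

\[
\begin{aligned}
f^+(s) &= 3, \qquad f^-(s) = 0, \\
f^+(u) &= f^-(u) = 2, \\
f^+(p) &= f^-(p) = 1, \\
f^+(v) &= f^-(v) = 1, \\
f^+(q) &= f^-(q) = 2, \\
f^+(t) &= 0, \qquad f^-(t) = 3
\end{aligned}
\]

and has value $|f| = 3$. The flow $g$ in Example 9.1.8 has value $|g| = 1$. More
generally, the flow $f_p$ in Remark 9.1.9 always has value $|f_p| = 1$.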


Example 9.1.12. For any network $N$, we can define the zero flow on $N$. This
is the flow $0_A : A \to \mathbb{N}$ that sends each arc $a \in A$ to $0$. This flow has value
$|0_A| = 0$.


9.2. The maximum flow problem and bipartite graphs 


Now we can state an important optimization problem, known as the maximum 
flow problem: Given a network N, how can we find a flow of maximum possi- 
ble value? 


Example 9.2.1. Finding a maximum matching in a bipartite graph is a par- 
ticular case of the maximum flow problem. 

Indeed, let (G, X,Y) be a bipartite graph. Then, we can transform this 
graph into a network N as follows: 


• Add two new vertices $s$ and $t$.


• Turn each edge $e$ of $G$ into an arc $\overrightarrow{e}$ whose source is the $X$-endpoint of
$e$ (that is, the endpoint of $e$ that belongs to $X$) and whose target is the
$Y$-endpoint of $e$ (that is, the endpoint of $e$ that belongs to $Y$).


• Add an arc from $s$ to each vertex in $X$.


• Add an arc from each vertex in $Y$ to $t$.


• Assign to each arc the capacity $1$.


Here is an example of a bipartite graph (G, X,Y) (as usual, drawn with 
the X-vertices on the left and with the Y-vertices on the right) and the corre- 
sponding network N: 


(we are not showing the capacities of the arcs, since they are all equal to 1). 
The flows of this network $N$ are in bijection with the matchings of $G$.
Namely, if $f$ is a flow on $N$, then the set

\[
\left\{ e \in E(G) \mid f(\overrightarrow{e}) = 1 \right\}
\]

is a matching of $G$. Conversely, if $M$ is a matching of $G$, then we obtain a flow
$f$ on $N$ by assigning the arc flow $1$ to all arcs of the form $\overrightarrow{e}$ where $e \in M$, as
well as assigning the arc flow $1$ to every new arc that joins $s$ or $t$ to a vertex
matched in $M$. All other arcs are assigned the arc flow $0$. For instance, in our
above example, the matching $\{15, 36\}$ corresponds to the following flow:


where we are using the convention that an arc a with f (a) = 0 is drawn 
dashed whereas an arc a with f (a) = 1 is drawn boldfaced (thankfully, the 
only possibilities for f (a) are 0 and 1, because all capacities are 1). 

One nice property of this bijection is that if a flow $f$ corresponds to a
matching $M$, then $|f| = |M|$. Thus, finding a flow of maximum value means
finding a matching of maximum size.
(See [17s-lec16, Proposition 1.36 till Proposition 1.40] for details and proofs;
that said, the proofs are straightforward and you will probably “see” them
just by staring at an example.)
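The construction in this example can be sketched in a few lines of Python. The sketch assumes that $G$ is a simple graph (so that an edge is just a pair $(x, y)$ with $x \in X$ and $y \in Y$) and that no vertex of $G$ is named `"s"` or `"t"`. The bipartite graph below is hypothetical, and all function names are assumptions of this sketch.

```python
def network_from_bipartite(X, Y, edges):
    """Build the arcs and capacities of the network N from (G, X, Y).

    edges is a set of pairs (x, y) with x in X and y in Y.
    """
    arcs = [("s", x) for x in X] + list(edges) + [(y, "t") for y in Y]
    cap = {a: 1 for a in arcs}          # every arc has capacity 1
    return arcs, cap


def flow_from_matching(arcs, M):
    """Turn a matching M (a set of edges) into a flow on N."""
    matched = {v for e in M for v in e}
    f = {}
    for (u, v) in arcs:
        if u == "s":                    # new arc from s to an X-vertex
            f[(u, v)] = 1 if v in matched else 0
        elif v == "t":                  # new arc from a Y-vertex to t
            f[(u, v)] = 1 if u in matched else 0
        else:                           # arc coming from an edge of G
            f[(u, v)] = 1 if (u, v) in M else 0
    return f


def matching_from_flow(edges, f):
    """Recover the matching {e | f(e) = 1} from a flow f."""
    return {e for e in edges if f[e] == 1}


def value(arcs, f):
    """|f| = f^+(s) - f^-(s)."""
    out_s = sum(f[a] for a in arcs if a[0] == "s")
    in_s = sum(f[a] for a in arcs if a[1] == "s")
    return out_s - in_s


# A hypothetical bipartite graph with X = {1, 2, 3} and Y = {4, 5, 6}.
X, Y = {1, 2, 3}, {4, 5, 6}
edges = {(1, 4), (1, 5), (2, 5), (3, 6)}
M = {(1, 5), (3, 6)}

arcs, cap = network_from_bipartite(X, Y, edges)
f = flow_from_matching(arcs, M)
print(value(arcs, f), len(M))           # the value of f and the size of M
print(matching_from_flow(edges, f) == M)
```

In this sketch, the value of the flow built from `M` agrees with the size of `M`, and converting the flow back recovers `M`, in line with the bijection described above.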


9.3. Basic properties of flows 


Before we approach the maximum flow problem, let us prove some simple 
observations about flows: 


Proposition 9.3.1. Let $N$ be a network consisting of a multidigraph $D =
(V, A, \psi)$, a source $s \in V$, a sink $t \in V$ and a capacity function $c : A \to \mathbb{N}$.
Let $f : A \to \mathbb{N}$ be a flow on $N$. Then,

\[
|f| = f^+(s) - f^-(s) = f^-(t) - f^+(t).
\]


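Proof. We have

\[
\sum_{v \in V} f^+(v)
= \sum_{v \in V} \ \sum_{\substack{a \in A \text{ is an arc} \\ \text{with source } v}} f(a)
= \sum_{a \in A} f(a)
\]

(the first equality holds since $f^+(v)$ is defined as the sum of $f(a)$ over all arcs $a \in A$
with source $v$; note that the whole computation is a generalization of the familiar fact that
$\sum_{v \in V} \deg^+ v = |A|$).
Similarly, $\sum_{v \in V} f^-(v) = \sum_{a \in A} f(a)$. Hence,

\[
\sum_{v \in V} \left( f^-(v) - f^+(v) \right)
= \underbrace{\sum_{v \in V} f^-(v)}_{= \sum_{a \in A} f(a)}
- \underbrace{\sum_{v \in V} f^+(v)}_{= \sum_{a \in A} f(a)}
= 0.
\tag{46}
\]

However, by the conservation constraints, we have $f^-(v) = f^+(v)$ for each
$v \in V \setminus \{s, t\}$. In other words, $f^-(v) - f^+(v) = 0$ for each $v \in V \setminus \{s, t\}$. Thus,
in the sum $\sum_{v \in V} \left( f^-(v) - f^+(v) \right)$, all addends are $0$ except for the addends for
$v = s$ and for $v = t$. Hence, the sum boils down to these two addends:

\[
\sum_{v \in V} \left( f^-(v) - f^+(v) \right)
= \left( f^-(s) - f^+(s) \right) + \left( f^-(t) - f^+(t) \right).
\]

Comparing this with (46), we obtain

\[
\left( f^-(s) - f^+(s) \right) + \left( f^-(t) - f^+(t) \right) = 0,
\]

so that

\[
f^-(t) - f^+(t) = - \left( f^-(s) - f^+(s) \right) = f^+(s) - f^-(s) = |f|
\]

(by the definition of $|f|$). This proves Proposition 9.3.1. □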


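Proposition 9.3.2. Let $N$ be a network consisting of a multidigraph $D =
(V, A, \psi)$, a source $s \in V$, a sink $t \in V$ and a capacity function $c : A \to \mathbb{N}$.
Let $f : A \to \mathbb{N}$ be a flow on $N$. Let $S$ be a subset of $V$. Then:


(a) We have

\[
f(S, \overline{S}) - f(\overline{S}, S) = \sum_{v \in S} \left( f^+(v) - f^-(v) \right).
\]

(Recall that we are using Definition 9.1.5 here, so that $f(P, Q)$ means
$\sum_{a \in [P, Q]} f(a)$.)


(b) Assume that $s \in S$ and $t \notin S$. Then,

\[
|f| = f(S, \overline{S}) - f(\overline{S}, S).
\]

(c) Assume that $s \in S$ and $t \notin S$. Then,

\[
|f| \leq c(S, \overline{S}).
\]

(d) Assume that $s \in S$ and $t \notin S$. Then, $|f| = c(S, \overline{S})$ if and only if

\[
\left( f(a) = 0 \text{ for all } a \in [\overline{S}, S] \right)
\quad \text{and} \quad
\left( f(a) = c(a) \text{ for all } a \in [S, \overline{S}] \right).
\]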


Proof. Let me first make these claims intuitive in terms of the “money transfer
scheme” model for our network. Consider $S$ as a country. Then, $f(S, \overline{S})$ is the
“export” from this country $S$ (that is, the total wealth exported from $S$), whereas
$f(\overline{S}, S)$ is the “import” into this country $S$ (that is, the total wealth imported
into $S$). Thus, part (a) of the proposition is saying that the “net export” of $S$ (that
is, the export from $S$ minus the import into $S$) can be computed by summing
the “outflow minus inflow” values of all accounts in $S$. This should match
the intuition for exports and imports (in particular, any transfers that happen
within $S$ should cancel out when we sum the “outflow minus inflow” values of
all accounts in $S$). Part (b) says that if the country $S$ contains the source $s$ but
not the sink $t$ (that is, the goal of the network is to transfer money out of the
country), then the total value transferred is actually the net export of $S$. Part (c)
claims that this total value is no larger than the total “export capacity” $c(S, \overline{S})$
(that is, the total capacity of the “export arcs” $a \in [S, \overline{S}]$). Part (d) says that if
equality holds in this inequality (i.e., if the total value equals the total export
capacity), then each “import arc” $a \in [\overline{S}, S]$ is unused (i.e., nothing is imported
into $S$), whereas each “export arc” $a \in [S, \overline{S}]$ is used to its full capacity.


I hope this demystifies all claims of the proposition. But for the sake of
completeness, here are rigorous proofs (though rather terse ones, since I assume
you have seen enough manipulations of sums to fill in the details):


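(a) This follows from

\[
\begin{aligned}
\sum_{v \in S} \left( f^+(v) - f^-(v) \right)
&= \underbrace{\sum_{v \in S} f^+(v)}_{\substack{= f(S, V) \\ \text{(why?)}}}
 - \underbrace{\sum_{v \in S} f^-(v)}_{\substack{= f(V, S) \\ \text{(why?)}}} \\
&= \underbrace{f(S, V)}_{= f(S, S) + f(S, \overline{S})}
 - \underbrace{f(V, S)}_{= f(S, S) + f(\overline{S}, S)} \\
&= \left( f(S, S) + f(S, \overline{S}) \right) - \left( f(S, S) + f(\overline{S}, S) \right)
 = f(S, \overline{S}) - f(\overline{S}, S)
\end{aligned}
\]

(where both decompositions in the second line hold since $V$ is the union of the
two disjoint sets $S$ and $\overline{S}$).

(b) We have $S \setminus \{s\} \subseteq V \setminus \{s, t\}$ (since $t \notin S$). From part (a), we obtain

\[
\begin{aligned}
f(S, \overline{S}) - f(\overline{S}, S)
&= \sum_{v \in S} \left( f^+(v) - f^-(v) \right) \\
&= \underbrace{\left( f^+(s) - f^-(s) \right)}_{\substack{= |f| \\ \text{(by the definition of } |f| \text{)}}}
 + \sum_{v \in S \setminus \{s\}} \underbrace{\left( f^+(v) - f^-(v) \right)}_{\substack{= 0 \\ \text{(by the conservation constraints,} \\ \text{since } v \in S \setminus \{s\} \subseteq V \setminus \{s, t\} \text{)}}}
 \qquad \text{(since } s \in S \text{)} \\
&= |f|.
\end{aligned}
\]

This proves part (b).


(c) The capacity constraints yield that $f(a) \leq c(a)$ for each arc $a \in A$. Summing
up these inequalities over all $a \in [S, \overline{S}]$, we obtain $f(S, \overline{S}) \leq c(S, \overline{S})$. The capacity
constraints furthermore yield that $f(a) \geq 0$ for each arc $a \in A$. Summing up these
inequalities over all $a \in [\overline{S}, S]$, we obtain $f(\overline{S}, S) \geq 0$. Hence, part (b) yields

\[
|f| = \underbrace{f(S, \overline{S})}_{\leq c(S, \overline{S})} - \underbrace{f(\overline{S}, S)}_{\geq 0} \leq c(S, \overline{S}).
\]

This proves part (c).


(d) We must characterize the equality case in part (c). However, recall the proof
of part (c): We obtained the inequality $|f| \leq c(S, \overline{S})$ by summing up the inequalities
$f(a) \leq c(a)$ over all arcs $a \in [S, \overline{S}]$ and subtracting the sum of the inequalities $f(a) \geq 0$
over all arcs $a \in [\overline{S}, S]$. Hence, in order for the inequality $|f| \leq c(S, \overline{S})$ to become
an equality, it is necessary and sufficient that all the inequalities involved (i.e., the
inequalities $f(a) \leq c(a)$ for all arcs $a \in [S, \overline{S}]$ as well as the inequalities $f(a) \geq 0$ for
all arcs $a \in [\overline{S}, S]$) become equalities. In other words, it is necessary and sufficient
that we have

\[
\left( f(a) = 0 \text{ for all } a \in [\overline{S}, S] \right)
\quad \text{and} \quad
\left( f(a) = c(a) \text{ for all } a \in [S, \overline{S}] \right).
\]

This proves Proposition 9.3.2 (d). □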
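Parts (b) and (c) of the proposition can be sanity-checked numerically. The Python sketch below runs through every subset $S$ with $s \in S$ and $t \notin S$ in a small hypothetical network and compares the “net export” of $S$ with the value of a flow and with the capacity $c(S, \overline{S})$. The network, the flow `f` and the helper `total` are assumptions of this sketch; a check on one example is, of course, no substitute for the proof.

```python
from itertools import combinations

# Hypothetical network and flow, for illustration only.
arcs = {"sp": ("s", "p"), "sq": ("s", "q"), "pq": ("p", "q"),
        "pt": ("p", "t"), "qt": ("q", "t")}
cap = {"sp": 3, "sq": 2, "pq": 1, "pt": 2, "qt": 3}
f = {"sp": 2, "sq": 1, "pq": 1, "pt": 1, "qt": 2}
V, s, t = {"s", "p", "q", "t"}, "s", "t"


def total(d, P, Q):
    """d(P, Q): the sum of d(a) over all arcs a with source in P, target in Q."""
    return sum(d[a] for a, (u, v) in arcs.items() if u in P and v in Q)


value = total(f, {s}, V) - total(f, V, {s})     # |f| = f^+(s) - f^-(s)

checks = []
inner = sorted(V - {s, t})
for r in range(len(inner) + 1):
    for extra in combinations(inner, r):
        S = {s} | set(extra)                    # s in S, t not in S
        S_bar = V - S
        net_export = total(f, S, S_bar) - total(f, S_bar, S)
        capacity = total(cap, S, S_bar)
        checks.append((sorted(S), net_export, capacity))

for S, net_export, capacity in checks:
    # Part (b) predicts net_export == value; part (c) predicts value <= capacity.
    print(S, net_export, capacity)
```

For the set `{"s", "q"}`, for instance, the arc `pq` enters the set, so its arc flow is subtracted: the net export is $(2 + 2) - 1 = 3$, which is again the value of `f`.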


9.4. The max-flow-min-cut theorem


9.4.1. Cuts and their capacities


One more definition, before we get to the hero of this story:


Definition 9.4.1. Let $N$ be a network consisting of a multidigraph $D =
(V, A, \psi)$, a source $s \in V$, a sink $t \in V$ and a capacity function $c : A \to \mathbb{N}$.
Then:


(a) A cut of $N$ shall mean a subset of $A$ that has the form $[S, \overline{S}]$, where $S$
is a subset of $V$ satisfying $s \in S$ and $t \notin S$.


(b) The capacity of a cut $[S, \overline{S}]$ is defined to be the number $c(S, \overline{S}) =
\sum_{a \in [S, \overline{S}]} c(a)$.


Example 9.4.2. Let us again consider the network from Example 9.1.2. Then,
$\left[ \{s, u\}, \overline{\{s, u\}} \right] = \{sp, uv, uq\}$ is a cut of this network, and its capacity is

\[
c\left( \{s, u\}, \overline{\{s, u\}} \right) = 5.
\]


9.4.2. The max-flow-min-cut theorem: statement 


Now, Proposition 9.3.2 (c) says that the value of any flow $f$ can never be larger
than the capacity of any cut $[S, \overline{S}]$. Thus, in particular, the maximum value of
a flow is $\leq$ the minimum capacity of a cut.

Furthermore, Proposition 9.3.2 (d) says that if this inequality is an equality
(i.e., if the value of some flow $f$ equals the capacity of some cut $[S, \overline{S}]$), then
the flow $f$ must use each arc that crosses the cut in the right direction (from $S$
to $\overline{S}$) to its full capacity and must not use any of the arcs that cross the cut in
the wrong direction (from $\overline{S}$ to $S$).

It turns out that this inequality actually is an equality for any maximum flow
and any minimum cut:
Theorem 9.4.3 (max-flow-min-cut theorem). Let $N$ be a network consisting
of a multidigraph $D = (V, A, \psi)$, a source $s \in V$, a sink $t \in V$ and a capacity
function $c : A \to \mathbb{N}$. Then,

\[
\max \left\{ |f| \ \mid \ f \text{ is a flow} \right\}
= \min \left\{ c(S, \overline{S}) \ \mid \ S \subseteq V; \ s \in S; \ t \notin S \right\}.
\]


In other words, the maximum value of a flow equals the minimum capacity 
of a cut. 


We shall soon sketch a proof of this theorem that doubles as an algorithm
for finding both a maximum flow (i.e., a flow of maximum value) and a
minimum cut (i.e., a cut of minimum capacity). The algorithm is known as
the Ford-Fulkerson algorithm. It is fairly efficient (polynomial-time if the
augmenting paths are chosen suitably, e.g., as shortest paths), and is
sufficiently fast to be useful in practice.
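Before turning to the algorithm, one can check the statement of the theorem by sheer brute force on a tiny example. The Python sketch below enumerates all maps $f : A \to \mathbb{N}$ with $0 \leq f(a) \leq c(a)$ on a hypothetical network, keeps those that satisfy the conservation constraints, and compares the largest value with the smallest cut capacity. This is only an illustration (the enumeration is exponential), and all names in it are assumptions of this sketch.

```python
from itertools import combinations, product

# Hypothetical network, for illustration only.
arcs = {"sp": ("s", "p"), "sq": ("s", "q"), "pq": ("p", "q"),
        "pt": ("p", "t"), "qt": ("q", "t")}
cap = {"sp": 3, "sq": 2, "pq": 1, "pt": 2, "qt": 3}
V, s, t = {"s", "p", "q", "t"}, "s", "t"
names = sorted(arcs)


def total(d, P, Q):
    """d(P, Q): the sum of d(a) over all arcs a with source in P, target in Q."""
    return sum(d[a] for a, (u, v) in arcs.items() if u in P and v in Q)


def conserves(f):
    """Conservation constraints: inflow equals outflow away from s and t."""
    return all(total(f, V, {v}) == total(f, {v}, V) for v in V - {s, t})


# Enumerate every map satisfying the capacity constraints.
candidates = (
    dict(zip(names, values))
    for values in product(*(range(cap[a] + 1) for a in names))
)
max_value = max(
    total(f, {s}, V) - total(f, V, {s}) for f in candidates if conserves(f)
)

# Enumerate every subset S with s in S and t not in S.
inner = sorted(V - {s, t})
min_cut = min(
    total(cap, {s} | set(extra), V - ({s} | set(extra)))
    for r in range(len(inner) + 1)
    for extra in combinations(inner, r)
)

print(max_value, min_cut)
```

On this network, both numbers come out as $5$, as the theorem predicts.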


9.4.3. How to augment a flow 


The idea of this algorithm is to start by having f be the zero flow (i.e., the 
flow from Example 9.1.12), and then gradually increase its value |f| by making 
changes to some of its arc flows f (a). 

Of course, we cannot unilaterally change the arc flow f (a) on a single arc, 
since this will (usually) mess up the conservation constraints. Thus, if we 
change f (a), then we will also have to change f (b) for some other arcs b € A 
to make the result a flow again. One way to do this is to increase all arc flows 
f (a) along some path from s to t. Here is an example of such an increase: 


Example 9.4.4. Consider the flow $f$ from Example 9.1.8. We can increase
the arc flows $f(sp)$, $f(pq)$, $f(qv)$, $f(vt)$ of $f$ on all the arcs of the path
$(s, p, q, v, t)$ by $1$ (since none of these arcs is used to its full capacity). As a
result, we obtain the following flow $h$:


whose value $|h|$ is $4$. It is easy to see that this is actually the maximum value
of a flow on our network (since $|h| = 4$ equals the capacity $c\left( \overline{\{t\}}, \{t\} \right)$ of the
cut $\left[ \overline{\{t\}}, \{t\} \right]$, but Proposition 9.3.2 (c) tells us that the value of any flow is
$\leq$ the capacity of any cut).
However, simple increases like the one we just did are not always enough to
find a maximum flow. They can leave us stuck at a “local maximum” — i.e., at a 
flow which does not have any more paths from s to t that can be used for any 
further increases (i.e., any path from s to t contains an arc that is already used 
to its full capacity), yet is not a maximum flow. Here is an example: 


Example 9.4.5. Consider the following network and flow: 


This flow is not maximum, but each path from s to t has at least one arc that 
is used to its full capacity. Thus, we cannot improve this flow by increasing 
all its arc flows on any given path from s to t. 


The trick to get past this hurdle is to use a “zig-zag path”: that is, not a
literal path, but rather a sequence $(v_0, a_1, v_1, a_2, v_2, \ldots, a_k, v_k)$ of vertices and
arcs that can use arcs both in the forward and backward directions (i.e., any
$i \in \{1, 2, \ldots, k\}$ has to satisfy either $\psi(a_i) = (v_{i-1}, v_i)$ or $\psi(a_i) = (v_i, v_{i-1})$).
Instead of increasing the flow on all arcs of this “path”, we do something slightly
subtler: On the forward arcs, we increase the flow; on the backward arcs, we
decrease it (all by the same amount). This, too, preserves the conservation
constraints (think about why; we will soon see a rigorous proof), so it is a valid
way of increasing the value of a flow. Here is an example:


Example 9.4.6. Consider the flow in Example 9.4.5. The underlying digraph
has a “zig-zag path” $(s, p, q, u, v, t)$, which uses the arc $uq$ in the backward
direction. We can increase the arc flows of $f$ on all forward arcs $sp$, $pq$, $uv$
and $vt$ of this “zig-zag path”, and decrease it on the backward arc $uq$. As a
result, we obtain the flow


[Figure: the resulting flow on the network.]


This new flow has value $2$, and can easily be seen to be a maximum flow.
Good news: Allowing ourselves to use “zig-zag paths” like this (rather than
literal paths only), we never get stuck at a non-maximum flow; we can always 
increase the value further and further until we eventually arrive at a maximum 
flow. 

In order to prove this, we introduce some convenient notations. We prefer 
not to talk about “zig-zag paths”, but rather reinterpret these “zig-zag paths” 
as (literal) paths of an appropriately chosen digraph (not D). This has the 
advantage of allowing us to use known properties of paths without having to 
first generalize them to “zig-zag paths”. 


9.4.4. The residual digraph 


The appropriately chosen digraph is the so-called residual digraph of a flow; 
it is defined as follows: 


Definition 9.4.7. Let $N$ be a network consisting of a multidigraph $D =
(V, A, \psi)$, a source $s \in V$, a sink $t \in V$ and a capacity function $c : A \to \mathbb{N}$.


(a) For each arc $a \in A$, we introduce a new arc $a^{-1}$, which should act like a
reversal of the arc $a$ (that is, its source should be the target of $a$, and its
target should be the source of $a$). We don’t add these new arcs $a^{-1}$ to
our digraph $D$, but we keep them ready for use in a different digraph
(which we will define below).

Here is what this means in rigorous terms: For each arc $a \in A$, we
introduce a new object, which we call $a^{-1}$. We let $A^{-1}$ be the set of
these new objects $a^{-1}$ for $a \in A$. We extend the map $\psi : A \to V \times V$ to
a map $\widehat{\psi} : A \cup A^{-1} \to V \times V$ as follows: For each $a \in A$, we let

\[
\widehat{\psi}(a) = (u, v) \qquad \text{and} \qquad \widehat{\psi}\left( a^{-1} \right) = (v, u),
\]

where $u$ and $v$ are defined by $(u, v) = \psi(a)$.

For each arc $a \in A$, we shall refer to the new arc $a^{-1}$ as the reversal of
$a$, and conversely, we shall refer to the original arc $a$ as the reversal of
$a^{-1}$. We set $\left( a^{-1} \right)^{-1} := a$ for each $a \in A$.

We shall refer to the arcs $a \in A$ as forward arcs, and to their reversals
$a^{-1}$ as backward arcs.


(b) Let $f : A \to \mathbb{N}$ be any flow on $N$. We define the residual digraph $D_f$
of this flow $f$ to be the multidigraph $(V, A_f, \psi_f)$, where

\[
A_f := \left\{ a \in A \mid f(a) < c(a) \right\} \cup \left\{ a^{-1} \mid a \in A; \ f(a) > 0 \right\}
\]

and $\psi_f := \widehat{\psi} \mid_{A_f}$. (This is usually not a subdigraph of $D$.) Thus, the
residual digraph $D_f$ has the same vertices as $D$, but its arcs are those
arcs of $D$ that are not used to their full capacity by $f$ as well as the
reversals of all arcs of $D$ that are used by $f$.
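The definition of the residual digraph can be turned into a short Python sketch. Here, a forward arc $a$ is encoded as the pair `(a, +1)` and its reversal $a^{-1}$ as the pair `(a, -1)`; this encoding, the function name `residual_digraph` and the sample network and flow are all assumptions of this sketch.

```python
# Hypothetical network and flow, for illustration only.
arcs = {"sp": ("s", "p"), "sq": ("s", "q"), "pq": ("p", "q"),
        "pt": ("p", "t"), "qt": ("q", "t")}
cap = {"sp": 3, "sq": 2, "pq": 1, "pt": 2, "qt": 3}
f = {"sp": 2, "sq": 1, "pq": 1, "pt": 1, "qt": 2}


def residual_digraph(arcs, cap, f):
    """Return the arcs of D_f as a dict: (arc, sign) -> (source, target).

    (a, +1) stands for the forward arc a; (a, -1) stands for its reversal.
    """
    residual = {}
    for a, (u, v) in arcs.items():
        if f[a] < cap[a]:          # a is not used to its full capacity
            residual[(a, +1)] = (u, v)
        if f[a] > 0:               # a is used, so its reversal is available
            residual[(a, -1)] = (v, u)
    return residual


D_f = residual_digraph(arcs, cap, f)
for (a, sign), (u, v) in sorted(D_f.items()):
    kind = "forward" if sign > 0 else "backward"
    print(f"{a:>2} {kind:>8}: {u} -> {v}")
```

In this example, the arc `pq` is used to its full capacity, so only its reversal (from `q` to `p`) appears in the residual digraph, whereas every other arc appears together with its reversal.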


Example 9.4.8. Let $f$ be the flow $f$ from Example 9.1.8. Then, the residual
digraph $D_f$ is


Notice that the digraph $D_f$ has cycles even though $D$ has none!


Example 9.4.9. Let $f$ be the non-maximum flow from Example 9.4.5. Then,
the residual digraph $D_f$ is


This digraph $D_f$ has a path from $s$ to $t$, which corresponds precisely to the
“zig-zag path” $(s, p, q, u, v, t)$ we found in Example 9.4.6.


You can think of the residual digraph $D_f$ as follows: Each arc of $D_f$
corresponds to an opportunity to change an arc flow $f(a)$; namely, a forward arc
$a$ of $D_f$ means that $f(a)$ can be increased, whereas a backward arc $a^{-1}$ of $D_f$
means that $f(a)$ can be decreased. Hence, the paths of the residual digraph $D_f$
are the “zig-zag paths” of $D$ that allow the flow $f$ to be increased (on forward
arcs) or decreased (on backward arcs) as in Example 9.4.6. Thus, using $D_f$, we
can avoid talking about “zig-zag paths”.
9.4.5. The augmenting path lemma


The following crucial lemma tells us that such “zig-zag path increases” are
valid (i.e., turn flows into flows), and are sufficient to find a maximum flow
(i.e., if no more “zig-zag path increases” are possible, then our flow is already
maximal):


Lemma 9.4.10 (augmenting path lemma). Let $N$ be a network consisting of
a multidigraph $D = (V, A, \psi)$, a source $s \in V$, a sink $t \in V$ and a capacity
function $c : A \to \mathbb{N}$. Let $f : A \to \mathbb{N}$ be a flow.


(a) If the digraph $D_f$ has a path from $s$ to $t$, then the network $N$ has a flow
$f'$ with a larger value than $f$.


(b) If the digraph $D_f$ has no path from $s$ to $t$, then the flow $f$ has maximum
value (among all flows on $N$), and there exists a subset $S$ of $V$ satisfying
$s \in S$ and $t \notin S$ and $|f| = c(S, \overline{S})$.


Proof. (a) Assume that the digraph $D_f$ has a path from $s$ to $t$. Pick such a path,
and call it $p$. Each arc of $p$ is an arc of $D_f$.

For each forward arc $a \in A$ that appears in $p$, we have $f(a) < c(a)$ (since $a$ is
an arc of $D_f$), and thus we can increase the arc flow $f(a)$ by some positive $\varepsilon \in \mathbb{N}$
(namely, by any $\varepsilon \leq c(a) - f(a)$) without violating the capacity constraints.
(Of course, such a unilateral increase will likely violate the conservation constraints.)

For each backward arc $a^{-1} \in A^{-1}$ that appears in $p$, we have $f(a) > 0$ (since
$a^{-1}$ is an arc of $D_f$), and thus we can decrease the arc flow $f(a)$ by some positive
$\varepsilon \in \mathbb{N}$ (namely, by any $\varepsilon \leq f(a)$) without violating the capacity constraints.

Let now

\[
\varepsilon := \min \Big(
\left\{ c(a) - f(a) \mid a \in A \text{ is a forward arc that appears in } p \right\}
\cup \left\{ f(a) \mid a^{-1} \in A^{-1} \text{ is a backward arc that appears in } p \right\}
\Big).
\]

This $\varepsilon$ is a positive integer, since it is a minimum of a set of positive integers.
(Indeed, for each forward arc $a \in A$ that appears in $p$, we have $f(a) < c(a)$ and
thus $c(a) - f(a) > 0$; and for each backward arc $a^{-1} \in A^{-1}$ that appears in $p$,
we have $f(a) > 0$.)

Let $f' : A \to \mathbb{N}$ be the map obtained from $f$ as follows:


• For each forward arc $a \in A$ that appears in $p$, we increase the arc flow
$f(a)$ by $\varepsilon$ (that is, we set $f'(a) := f(a) + \varepsilon$).


• For each backward arc $a^{-1} \in A^{-1}$ that appears in $p$, we decrease the arc
flow $f(a)$ by $\varepsilon$ (that is, we set $f'(a) := f(a) - \varepsilon$).


• For all other arcs $a$ of $D$, we keep the arc flow $f(a)$ unchanged (i.e., we
set $f'(a) := f(a)$).


This new map f’ still satisfies the capacity constraintd!!|_ We claim that it 
also satisfies the conservation constraints. To check this, we have to verify that 
(F) (v) = (f) (v) for each vertex v € V \ {s,t}. So let us do this. 

Let v € V \ {s,t} be a vertex. We know that f~ (v) = f+ (v) (since f is a 
flow). We must prove that (f’)” (v) = (f’)” (v). 

The path p is a path from s to t. Thus, it neither starts nor ends at v (since 
v € V \ {s,t}). Hence, if v is a vertex of p, then the path p enters v by some arc 
and exits v by another. Hence, we are in one of the following five cases: 

Case 1: The vertex v is not a vertex of the path p. 

Case 2: The path p enters v by a forward arc and exits v by a forward arc. 

Case 3: The path p enters v by a forward arc and exits v by a backward arc. 

Case 4: The path p enters v by a backward arc and exits v by a forward arc. 

Case 5: The path p enters v by a backward arc and exits v by a backward arc. 

Now, we can prove (f’) (v) = (f’)* (v) in each of these five cases by hand. 
Here is how this can be done in the first three cases: 

First, we consider Case 1. In this case, v is not a vertex of the path p. Hence, 
each arc a € A with target v satisfies f’ (a) = f (a) (because neither a nor a`! 
appears in p). Therefore, (f’) (v) = f~ (v). Similarly, (f’)* (v) = ft (v). 
Hence, (f) (v) = fF (w) = fto) = (ou (v). Thus, we have proved 
(f’)” (7) = (f')” (@) in Case 1. 

Let us now consider Case 2. In this case, the path p enters v by a forward arc 
and exits v by a forward arc. Let b be the former arc, and c the latter. Then, both 
b and c are arcs of D, and the vertex v is the target of b and the source of c. The 
definition of f’ yields that f’ (b) = f (b) + e, whereas each other arc a € A with 
target v satisfies f’ (a) = f (a). Hence, (f') (v) = f~ (v) +e. Similarly, using 
the arc c, we see that (f’)* (v) = f+ (v) +e. Hence, (f’)” (v) = f7 (v) +e = 


=f+(v) 
f+ (v) +e =(f’)* (v). Thus, we have proved (f’)~ (v) = (f’)* (v) in Case 2. 
Let us next consider Case 3. In this case, the path p enters v by a forward arc
and exits v by a backward arc. Let b be the former arc, and $c^{-1}$ the latter. Then,
both b and c are arcs of D, and the vertex v is the target of both b and c. The
definition of f' yields that $f'(b) = f(b) + \varepsilon$ (since p uses the forward arc b) and
$f'(c) = f(c) - \varepsilon$ (since p uses the backward arc $c^{-1}$), whereas each other arc
$a \in A$ with target v satisfies $f'(a) = f(a)$. Hence,
$(f')^-(v) = f^-(v) + \varepsilon - \varepsilon = f^-(v)$. Moreover, $(f')^+(v) = f^+(v)$ (since none of the
arcs of D with source v appears in p, nor does its reversal). Hence,
$(f')^-(v) = f^-(v) = f^+(v) = (f')^+(v)$. Thus, we have proved $(f')^-(v) = (f')^+(v)$ in Case 3.


111 since the definition of $\varepsilon$ shows that

• for each forward arc a that appears in p, we have $\varepsilon \le c(a) - f(a)$ and thus
$f(a) + \varepsilon \le c(a)$;

• for each backward arc $a^{-1} \in A^{-1}$ that appears in p, we have $\varepsilon \le f(a)$ and thus
$f(a) - \varepsilon \ge 0$.

The other two cases are similar (Case 4 is analogous to Case 3, while Case 5
is analogous to Case 2). Thus, altogether, we have proved $(f')^-(v) = (f')^+(v)$
in all five cases.

Forget that we fixed v. We thus have shown that each vertex $v \in V \setminus \{s,t\}$
satisfies $(f')^-(v) = (f')^+(v)$. In other words, the map f' satisfies the
conservation constraints. Since f' also satisfies the capacity constraints, we thus
conclude that f' is a flow.

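What is the value $|f'|$ of this flow? The path p starts at s, so it exits s by
some arc $\gamma$ (it must have at least one arc, since $s \neq t$) and never comes back
to s again. If this arc $\gamma$ is a forward arc b, then $f'(b) = f(b) + \varepsilon$ and therefore
$(f')^+(s) = f^+(s) + \varepsilon$ and $(f')^-(s) = f^-(s)$. If this arc $\gamma$ is a backward arc
$c^{-1}$, then $f'(c) = f(c) - \varepsilon$ and therefore $(f')^-(s) = f^-(s) - \varepsilon$ and
$(f')^+(s) = f^+(s)$. Thus,

$$(f')^+(s) - (f')^-(s)
= \begin{cases}
\left(f^+(s) + \varepsilon\right) - f^-(s), & \text{if } \gamma \text{ is a forward arc;} \\
f^+(s) - \left(f^-(s) - \varepsilon\right), & \text{if } \gamma \text{ is a backward arc}
\end{cases}
= \underbrace{f^+(s) - f^-(s)}_{\substack{= |f| \\ \text{(by the definition of } |f| \text{)}}} + \varepsilon = |f| + \varepsilon.$$

However, the definition of the value $|f'|$ yields

$$|f'| = (f')^+(s) - (f')^-(s) = |f| + \varepsilon > |f| \qquad (\text{since } \varepsilon > 0).$$

In other words, the flow f' has a larger value than f. Thus, we have found a
flow f' with a larger value than f. This proves Lemma 9.4.10 (a).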
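
The augmentation step from this proof can be sketched in code. What follows is only an illustration under assumptions that are not part of the notes: arcs are stored as a list of (source, target) pairs, a flow is a list of integers indexed like the arcs, and a path of the residual digraph is passed as a list of pairs (arc index, direction), with direction +1 for a forward arc and -1 for a backward arc. The names `inflow`, `outflow` and `augment` are made up for this sketch.

```python
def inflow(arcs, f, v):
    """f^-(v): total flow on the arcs with target v."""
    return sum(f[i] for i, (_, tgt) in enumerate(arcs) if tgt == v)


def outflow(arcs, f, v):
    """f^+(v): total flow on the arcs with source v."""
    return sum(f[i] for i, (src, _) in enumerate(arcs) if src == v)


def augment(cap, f, path):
    """Return (f', eps) for a path of the residual digraph.

    path is a list of (arc index, direction); direction +1 means the
    arc is used forwards, -1 means it is used backwards.
    """
    # eps is the smallest residual capacity along the path.
    eps = min(cap[i] - f[i] if d == 1 else f[i] for i, d in path)
    g = list(f)
    for i, d in path:
        g[i] += d * eps  # forward arcs gain eps, backward arcs lose eps
    return g, eps


# Hypothetical example network in which every arc has capacity 1.
arcs = [("s", "a"), ("s", "b"), ("a", "b"), ("a", "t"), ("b", "t")]
cap = [1, 1, 1, 1, 1]
f = [1, 0, 1, 0, 1]                 # flow along s -> a -> b -> t
path = [(1, +1), (2, -1), (3, +1)]  # s -> b, then a -> b backwards, then a -> t
f_new, eps = augment(cap, f, path)
```

In this example, the vertex b falls under Case 3 (entered by a forward arc, left by a backward arc) and the vertex a under Case 4; comparing `inflow` and `outflow` at both vertices before and after the call shows the conservation constraints surviving the augmentation.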


(b) Assume that the digraph $D_f$ has no path from s to t. Define a subset S of
V by
$$S = \{v \in V \mid \text{the digraph } D_f \text{ has a path from } s \text{ to } v\}.$$

Then, $s \in S$ (because the trivial path (s) is a path from s to s) and $t \notin S$ (since we
assumed that $D_f$ has no path from s to t). We shall next show that $|f| = c(S, \overline{S})$.
Indeed, we shall obtain this from Proposition 9.3.2 (d). To do so, we will first
show that
$$f(a) = 0 \text{ for all } a \in [\overline{S}, S] \tag{47}$$
and
$$f(a) = c(a) \text{ for all } a \in [S, \overline{S}]. \tag{48}$$


[Proof of (47): Let $a \in [\overline{S}, S]$. Assume that $f(a) \neq 0$. Thus, $f(a) > 0$ (since
the capacity constraints yield $f(a) \ge 0$). Hence, the backward arc $a^{-1}$ is an arc
of the residual digraph $D_f$. Let u be the source of a, and let v be the target of
a. Since $a \in [\overline{S}, S]$, we thus have $u \in \overline{S}$ and $v \in S$. From $v \in S$, we see that the
digraph $D_f$ has a path from s to v. Let q be this path. Appending the backward
arc $a^{-1}$ (which is an arc from v to u) and the vertex u to this path q (at the end),
we obtain a walk from s to u in $D_f$. Hence, $D_f$ has a walk from s to u, thus also
a path from s to u (by Corollary 4.5.8). This entails $u \in S$ (by the definition of
S). However, this contradicts $u \in \overline{S} = V \setminus S$. This contradiction shows that our
assumption (that $f(a) \neq 0$) was wrong. Therefore, $f(a) = 0$. This proves (47).]

[Proof of (48): Let $a \in [S, \overline{S}]$. Assume that $f(a) \neq c(a)$. Thus, $f(a) < c(a)$
(since the capacity constraints yield $f(a) \le c(a)$). Hence, the forward arc a is
an arc of the residual digraph $D_f$. Let u be the source of a, and let v be the
target of a. Since $a \in [S, \overline{S}]$, we thus have $u \in S$ and $v \in \overline{S}$. From $u \in S$, we see
that the digraph $D_f$ has a path from s to u. Let q be this path. Appending the
forward arc a (which is an arc from u to v) and the vertex v to this path q (at the
end), we obtain a walk from s to v in $D_f$. Hence, $D_f$ has a walk from s to v, thus
also a path from s to v (by Corollary 4.5.8). This entails $v \in S$ (by the definition
of S). However, this contradicts $v \in \overline{S} = V \setminus S$. This contradiction shows that
our assumption (that $f(a) \neq c(a)$) was wrong. Therefore, $f(a) = c(a)$. This
proves (48).]

Now, Proposition 9.3.2 (d) yields that $|f| = c(S, \overline{S})$ holds (since (47) and
(48) hold).

We have now found a subset S of V satisfying $s \in S$ and $t \notin S$ and
$|f| = c(S, \overline{S})$. In order to prove Lemma 9.4.10 (b), it suffices to show that the flow
f has maximum value (among all flows on N). However, this is now easy:
Any flow g on N has value $|g| \le c(S, \overline{S})$ (by Proposition 9.3.2 (c), applied to
g instead of f). In other words, any flow g on N has value $|g| \le |f|$ (since
$|f| = c(S, \overline{S})$). Thus, the flow f has maximum value. This completes the proof
of Lemma 9.4.10 (b). □
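
The construction of the set S in part (b) can be tried out on a small example. The sketch below is an illustration only, with an assumed data layout (arcs as a list of (source, target) pairs, capacities and flow as lists indexed like the arcs) and made-up function names; it collects the vertices reachable from s in the residual digraph and computes the capacity of the resulting cut.

```python
def residual_reachable(arcs, cap, f, s):
    """Set S of all vertices reachable from s in the residual digraph D_f."""
    seen = {s}
    stack = [s]
    while stack:
        u = stack.pop()
        for i, (x, y) in enumerate(arcs):
            # Forward arc x -> y is in D_f when the arc is not saturated.
            if x == u and f[i] < cap[i] and y not in seen:
                seen.add(y)
                stack.append(y)
            # Backward arc y -> x is in D_f when the arc carries flow.
            if y == u and f[i] > 0 and x not in seen:
                seen.add(x)
                stack.append(x)
    return seen


def cut_capacity(arcs, cap, S):
    """c(S, complement of S): total capacity of the arcs leaving S."""
    return sum(cap[i] for i, (x, y) in enumerate(arcs) if x in S and y not in S)


# Hypothetical example: a maximum flow of value 2.
arcs = [("s", "a"), ("s", "b"), ("a", "t"), ("b", "t")]
cap = [3, 1, 1, 2]
f = [1, 1, 1, 1]
S = residual_reachable(arcs, cap, f, "s")
value = f[0] + f[1]  # no arc enters s here, so |f| is the outflow of s
```

Here S comes out as {s, a}, and the two arcs leaving S are both saturated, in line with (48); no arc enters S from outside, so (47) holds vacuously.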


9.4.6. Proof of max-flow-min-cut 


We are now ready to prove the max-flow-min-cut theorem (Theorem 9.4.3):


Proof of Theorem 9.4.3. We let $f : A \to \mathbb{N}$ be the zero flow on N (see Example
9.1.12 for its definition). Now, we shall incrementally increase the value |f| of
this flow by the following algorithm (known as the Ford-Fulkerson algorithm):


1. Construct the residual digraph $D_f$.

2. If the digraph $D_f$ has a path from s to t, then Lemma 9.4.10 (a) shows that
the network N has a flow f' with a larger value than f (and furthermore,
the proof of Lemma 9.4.10 (a) shows how to find such an f' efficiently¹¹²).
Fix such an f', and replace f by f'. Then, go back to step 1.


3. If the digraph $D_f$ has no path from s to t, then we end the algorithm.


The replacement of f by f’ in Step 2 of this algorithm will be called an aug- 
mentation. Thus, the algorithm proceeds by repeatedly performing augmenta- 
tions until this is no longer possible. 
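
The three steps above can be sketched as a short program. This is a minimal illustration and not an implementation taken from the notes: the data layout (arcs as (source, target) pairs, capacities as a list of nonnegative integers) and the names `find_residual_path` and `ford_fulkerson` are assumptions, and the path search is a plain depth-first search chosen for brevity.

```python
def find_residual_path(arcs, cap, f, s, t):
    """Search D_f for a path from s to t.

    Returns a list of (arc index, direction) pairs, or None if t cannot
    be reached. Direction +1 is a forward arc, -1 a backward arc.
    """
    parent = {s: None}  # vertex -> (previous vertex, arc index, direction)
    stack = [s]
    while stack:
        u = stack.pop()
        if u == t:
            break
        for i, (x, y) in enumerate(arcs):
            if x == u and f[i] < cap[i] and y not in parent:
                parent[y] = (u, i, +1)
                stack.append(y)
            if y == u and f[i] > 0 and x not in parent:
                parent[x] = (u, i, -1)
                stack.append(x)
    if t not in parent:
        return None
    path = []
    v = t
    while parent[v] is not None:
        u, i, d = parent[v]
        path.append((i, d))
        v = u
    return path[::-1]


def ford_fulkerson(arcs, cap, s, t):
    """Start from the zero flow and augment until D_f has no s-t path."""
    f = [0] * len(arcs)
    while True:
        path = find_residual_path(arcs, cap, f, s, t)   # steps 1 and 2
        if path is None:                                # step 3
            return f
        eps = min(cap[i] - f[i] if d == 1 else f[i] for i, d in path)
        for i, d in path:
            f[i] += d * eps                             # one augmentation


# Hypothetical example network with unit capacities.
arcs = [("s", "a"), ("s", "b"), ("a", "b"), ("a", "t"), ("b", "t")]
max_flow = ford_fulkerson(arcs, [1, 1, 1, 1, 1], "s", "t")
```

The residual digraph is never stored explicitly in this sketch; its arcs are recognized on the fly by the two conditions inside the search loop.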

I claim that the algorithm will eventually end, i.e., it cannot keep performing
augmentations forever. Indeed, each augmentation increases the value |f| of the
flow f, and therefore it increases this value |f| by at least 1 (because increasing
an integer always means increasing it by at least 1). However, the value |f|
is bounded from above by the capacity $c(S, \overline{S})$ of an arbitrary cut $[S, \overline{S}]$ (by
Proposition 9.3.2 (c)), and thus cannot get increased by 1 more than $c(S, \overline{S})$
many times (since its initial value is 0). Therefore, we cannot perform more
than $c(S, \overline{S})$ many augmentations in sequence.

Thus, the algorithm eventually ends. Let us consider the flow f that is
obtained once the algorithm has ended. This flow f has the property that the
digraph $D_f$ has no path from s to t. Thus, Lemma 9.4.10 (b) shows that the
flow f has maximum value (among all flows on N), and there exists a subset S
of V satisfying $s \in S$ and $t \notin S$ and $|f| = c(S, \overline{S})$. Consider this S.

Since the flow f has maximum value, we have

$$|f| = \max \{|g| \mid g \text{ is a flow}\}.$$
On the other hand, for each subset T of V satisfying $s \in T$ and $t \notin T$, we have
$$c(S, \overline{S}) = |f| \le c(T, \overline{T})$$
(by Proposition 9.3.2 (c), applied to T instead of S). Hence,
$$c(S, \overline{S}) = \min \{c(T, \overline{T}) \mid T \subseteq V;\ s \in T;\ t \notin T\}.$$

Comparing this with

$$c(S, \overline{S}) = |f| = \max \{|g| \mid g \text{ is a flow}\},$$
we obtain

$$\max \{|g| \mid g \text{ is a flow}\} = \min \{c(T, \overline{T}) \mid T \subseteq V;\ s \in T;\ t \notin T\}.$$


In other words, the maximum value of a flow equals the minimum capacity of
a cut. This proves Theorem 9.4.3. (Of course, we cannot use the letters f and S
for the bound variables in $\max \{|g| \mid g \text{ is a flow}\}$ and
$\min \{c(T, \overline{T}) \mid T \subseteq V;\ s \in T;\ t \notin T\}$, since f and S already stand for a specific
flow and a specific set.) □


112 Of course, this requires an algorithm for finding a path from s to t in $D_f$. But there are many
efficient algorithms for this (see, e.g., homework set #4 exercise 5).
Remark 9.4.11. All the theorems, propositions and lemmas we
proved in this chapter still hold if we replace the set $\mathbb{N}$ by the
set $\mathbb{Q}_+ := \{\text{nonnegative rational numbers}\}$ or the set
$\mathbb{R}_+ := \{\text{nonnegative real numbers}\}$. However, their proofs get more
complicated. The problem is that if the arc flows of f belong to $\mathbb{Q}_+$ or $\mathbb{R}_+$ rather
than $\mathbb{N}$, it is possible for |f| to increase endlessly (cf. Zeno’s paradox of
Achilles and the tortoise), as we make smaller and smaller improvements to
our flow but never achieve (or even approach!) the maximum value.

With rational values, this fortunately cannot happen, since the lowest
common denominator of all arc flows f(a) does not change when we perform
an augmentation. (To put it differently: The case of rational values can be
reduced to the case of integer values by multiplying through with the
lowest common denominator.) With real values, however, this misbehavior can
occur (see §I.8] for an example). Fortunately, there is a way to
avoid it by choosing a shortest path from s to t in $D_f$ at each step. This is
known as the Edmonds-Karp version of the Ford-Fulkerson algorithm (or,
for short, the Edmonds-Karp algorithm). Proving that it works takes a bit
more work, which we won't do here (see, e.g., Theorem 4.4]).
Incidentally, this technique also helps keep the algorithm fast for integer-valued
flows (running time $O(|V| \cdot |A|^2)$).
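
The reduction from rational to integer values mentioned in this remark can be made concrete. The following sketch is an illustration only (the function name `scale_to_integers` is made up, and `math.lcm` requires Python 3.9 or later): it multiplies a list of rational capacities by their lowest common denominator, so that an integer-valued max-flow computation can be run on the scaled network and its result divided by the same factor afterwards.

```python
from fractions import Fraction
from math import lcm


def scale_to_integers(cap):
    """Return (d, scaled) where d is the lowest common denominator of the
    rational capacities and scaled[i] = d * cap[i] is an integer."""
    d = lcm(*(c.denominator for c in cap))
    return d, [int(c * d) for c in cap]


# Hypothetical rational capacities.
cap = [Fraction(1, 2), Fraction(2, 3), Fraction(5, 6)]
d, scaled = scale_to_integers(cap)
```

A flow g on the scaled network corresponds to the flow g/d on the original one, and the values of the two flows differ by the same factor d.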


9.5. Application: Deriving Hall–König


Now, let us apply the max-flow-min-cut theorem to prove the Hall–König
matching theorem (Theorem 8.4.7):


Proof of Theorem 8.4.7 (sketched). (This is an outline; see proof of Lemma
1.42] for details¹¹³.) As explained in Example we can turn the bipartite
graph (G, X, Y) into a network so that the matchings of G become the flows f
of this network. The max-flow-min-cut theorem (Theorem 9.4.3) yields that

$$\max \{|f| \mid f \text{ is a flow}\} = \min \{c(S, \overline{S}) \mid S \subseteq V;\ s \in S;\ t \notin S\},$$

where V is the vertex set of the digraph that underlies our network. Thus, there
exist a flow f and a cut $[S, \overline{S}]$ of this network such that $|f| = c(S, \overline{S})$. Consider
these f and S. Thus, S is a subset of V such that $s \in S$ and $t \notin S$.

Let M be the matching of G corresponding to the flow f (that is, we let M be
the set of all edges e of G such that f takes the value 1 on the arc corresponding
to e). Thus, $|M| = |f|$.


113 Note that Lemma 1.42] is stated only for a simple graph G, not for a multigraph
G. However, this really makes no difference here: If (G, X, Y) is a bipartite graph with G
being a multigraph, then $(G^{\mathrm{simp}}, X, Y)$ is a bipartite graph as well, and clearly any matching
of $G^{\mathrm{simp}}$ yields a matching of G having the same size (and the set N(U) does not change
from G to $G^{\mathrm{simp}}$ either). Thus, in proving Theorem 8.4.7, we can WLOG assume that G is a
simple graph.


An introduction to graph theory, version August 2, 2023 page 360 


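Let $U := X \cap S$. Then, U is a subset of X. Here is an illustration of the cut
$[S, \overline{S}]$ on a simple example (the flow f is not shown):


(the orange oval is the set U).
Now, we have

$$\begin{aligned}
|M| = |f| = c(S, \overline{S})
&= \underbrace{c(\{s\}, \overline{S})}_{\substack{= |X \setminus U| \\ \text{(why?)}}}
 + c(\underbrace{X \cap S}_{= U}, \overline{S})
 + \underbrace{c(Y \cap S, \overline{S})}_{\substack{= |Y \cap S| \\ \text{(why?)}}} \\
&\qquad (\text{since } S \text{ is the union of the disjoint sets } \{s\},\ X \cap S \text{ and } Y \cap S) \\
&= \underbrace{|X \setminus U|}_{= |X| - |U|} + \underbrace{c(U, \overline{S}) + |Y \cap S|}_{\ge |N(U)|} \\
&\ge |X| - |U| + |N(U)| = |N(U)| + |X| - |U|.
\end{aligned}$$

(The inequality $c(U, \overline{S}) + |Y \cap S| \ge |N(U)|$ holds since each vertex $y \in N(U)$ either
belongs to $Y \cap S$ and thus contributes to $|Y \cap S|$, or belongs to $\overline{S}$ and thus
contributes to $c(U, \overline{S})$.)
This proves Theorem 8.4.7. □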


Having proved the Hall–König matching theorem (Theorem 8.4.7), we have
thus completed the proofs of Hall’s marriage theorem (Theorem 8.3.4) and of
König’s theorem (Theorem 8.4.6) as well, because we already know how to
derive the latter two theorems from the former.


9.6. Other applications 


Further applications of the max-flow-min-cut theorem include: 
• A curious fact about rounding matrix entries (stated in terms of a digraph
in Exercise 4.13]): Let A be an m × n-matrix with real entries.
Assume that all row sums¹¹⁴ of A and all column sums¹¹⁵ of A are
integers. Then, we can round each non-integer entry of A (that is, replace
it either by the next-smaller integer or the next-larger integer) in such a
way that the resulting matrix has the same row sums as A and the same
column sums as A.


• An Euler-Hierholzer-like criterion for the existence of an Eulerian circuit
in a “mixed graph” (a general notion of a graph that can contain both
undirected edges and directed arcs) [ForFul74, §II.7].


• A proof §6.3] of the Erdős–Gallai theorem, which states that for
a given weakly decreasing n-tuple $(d_1 \ge d_2 \ge \cdots \ge d_n)$ of nonnegative
integers, there exists a simple graph with n vertices whose n vertices have
degrees $d_1, d_2, \ldots, d_n$ if and only if the sum $d_1 + d_2 + \cdots + d_n$ is even and
each $k \in \{1, 2, \ldots, n\}$ satisfies

$$\sum_{i=1}^{k} d_i \le k(k-1) + \sum_{i=k+1}^{n} \min\{d_i, k\}.$$

(The “only if” part of this theorem was Exercise 2.6 = Exercise 6 on
homework set #2.)
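
The Erdős–Gallai condition is easy to evaluate by machine. The sketch below only checks the stated parity condition and the n inequalities for a given degree sequence; it is an illustration with a made-up function name, not a construction of the graph and not part of the flow-based proof.

```python
def satisfies_erdos_gallai(degrees):
    """Check the Erdos-Gallai condition for a list of nonnegative integers."""
    d = sorted(degrees, reverse=True)  # weakly decreasing order
    n = len(d)
    if sum(d) % 2 != 0:
        return False
    for k in range(1, n + 1):
        left = sum(d[:k])                                      # d_1 + ... + d_k
        right = k * (k - 1) + sum(min(x, k) for x in d[k:])    # tail terms min{d_i, k}
        if left > right:
            return False
    return True
```

For instance, (3, 3, 3, 3) passes (it is the degree sequence of the complete graph on 4 vertices), whereas (3, 3, 1, 1) has an even sum but fails the inequality for k = 2.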


The following exercise can be solved both with and without using the max- 
flow-min-cut theorem; it should make good practice to solve it in both ways. 


Exercise 9.2. Consider a network consisting of a multidigraph $D = (V, A, \psi)$,
a source $s \in V$ and a sink $t \in V$, and a capacity function $c : A \to \mathbb{N}$ such that
$s \neq t$. (You can replace $\mathbb{N}$ by $\mathbb{Q}_+$ or $\mathbb{R}_+$ here.)

An s-t-cutting subset shall mean a subset S of V satisfying $s \in S$ and $t \notin S$.

Let m denote the minimum possible value of $c(S, \overline{S})$ where S ranges over
the s-t-cutting subsets. (Recall that this is the maximum value of a flow,
according to Theorem 9.4.3.)

An s-t-cutting subset S is said to be cut-minimal if it satisfies $c(S, \overline{S}) = m$.

Let X and Y be two cut-minimal s-t-cutting subsets. Prove that $X \cap Y$ and
$X \cup Y$ also are cut-minimal s-t-cutting subsets.


[Solution: This is Exercise 7 on homework set #5 from my Spring 2017 
course (except that the simple digraph has been replaced by a multidigraph); 


see the course page for solutions. ] 


114 A row sum of a matrix means the sum of all entries in some row of this matrix. Thus, an
m × n-matrix has m row sums.

115 A column sum of a matrix means the sum of all entries in some column of this matrix. Thus,
an m × n-matrix has n column sums.
10. More about paths


In this chapter, we will learn a few more things about paths in graphs and 
digraphs. 


10.1. Menger’s theorems 


We begin with a series of fundamental results known as Menger’s theorems
(named after Karl Menger, who discovered one of them in 1927 as an auxiliary
result in a topological study of curves¹¹⁶).

Imagine you have 4 different ways to get from Philadelphia to NYC, all using 
different roads (i.e., no piece of road is used by more than one of your 4 ways). 
Then, if 3 arbitrary roads get blocked, then you still have a way to get to NYC. 

This is obvious (indeed, each blocked road destroys at most one of your 4 
paths, so you still have at least one path left undisturbed after 3 roads have 
been blocked). A more interesting question is the converse: If the road network 
is sufficiently robust that blocking 3 arbitrary roads will not disconnect you 
from NYC, does this mean that you can find 4 different ways to NYC all using 
different roads? 

Menger’s theorems answer this question (and various questions of this kind) 
in the positive, in several different setups. Each of these theorems can be 
roughly described as “the maximum number of pairwise independent paths 
from some place to another place equals the minimum size of a bottleneck that 
separates the former from the latter”. Here, the “places” can be vertices or sets 
of vertices; the word “independent” can mean “having no arcs in common” 
or “having no intermediate vertices in common” or “having no vertices at all 
in common”; and the word “bottleneck” can mean a set of arcs or of vertices 
whose removal would disconnect the former place from the latter. Here is a 
quick overview of all Menger’s theorems that we will prove¹¹⁷:


• for directed graphs:

  [Table with the columns “the places are ...”, “the paths must be ...” and
  “the bottleneck consists of ...”; its rows are missing here.]


• for undirected graphs:

           the paths must be ...          the bottleneck consists of ...
  10.1.42  internally vertex-disjoint     vertices $\in V \setminus \{s,t\}$
  10.1.44  internally vertex-disjoint     vertices $\in V \setminus (X \cup Y)$


116 See [Schrij03, §9.6e] for more about its history.

117 All undefined terminology used here will be defined further below.


(I could state more, but I don’t want this to go on forever.) 


10.1.1. The arc-Menger theorem for directed graphs 


We begin with the most natural setup: a directed graph (one-way roads) with 
roads being arcs. The following definitions will help keep the theorems short: 


Definition 10.1.1. Two walks p and q in a digraph are said to be arc-disjoint 
if they have no arc in common. 


Example 10.1.2. The following picture shows two arc-disjoint paths p and q 


(they can be told apart by their labels: each arc of p is labelled with a “p”, 
and likewise for q): 


The following picture shows two paths r and s that are not arc-disjoint (the
common arc is marked with “r,s”):



Definition 10.1.3. Let $D = (V, A, \psi)$ be a multidigraph, and let s and t be
two vertices of D. A subset B of A is said to be an s-t-arc-separator if each
path from s to t contains at least one arc from B. Equivalently, a subset B of
A is said to be an s-t-arc-separator if the multidigraph $(V, A \setminus B, \psi|_{A \setminus B})$
has no path from s to t (in other words, removing from D all arcs contained
in B destroys all paths from s to t).


Example 10.1.4. Let $D = (V, A, \psi)$ be the following multidigraph:


Then, the set $\{\alpha, \gamma\}$ is not an s-t-arc-separator (since the path drawn in blue
contains no arc from this set). However, the set $\{\beta, \gamma\}$ is an s-t-arc-separator,
and so is the set $\{\delta, \varepsilon\}$. Of course, any set that contains any of $\{\beta, \gamma\}$ and
$\{\delta, \varepsilon\}$ as a subset is therefore an s-t-arc-separator as well.


Example 10.1.5. Let D be a multidigraph. Let s and t be two vertices of D.
Then, the empty set $\varnothing$ is an s-t-arc-separator if and only if D has no path
from s to t. This degenerate case should not be forgotten!
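
The second formulation in Definition 10.1.3 translates directly into a test: remove the arcs of B and check whether t is still reachable from s. The sketch below is an illustration with assumed conventions (arcs as a list of (source, target) pairs, B as a set of arc indices) and made-up function names.

```python
def has_path(arcs, s, t, removed=frozenset()):
    """Is there a path from s to t that avoids the arcs indexed by removed?"""
    seen = {s}
    stack = [s]
    while stack:
        u = stack.pop()
        for i, (x, y) in enumerate(arcs):
            if i not in removed and x == u and y not in seen:
                seen.add(y)
                stack.append(y)
    return t in seen


def is_arc_separator(arcs, s, t, B):
    """B (a set of arc indices) is an s-t-arc-separator if and only if
    deleting the arcs in B leaves no path from s to t."""
    return not has_path(arcs, s, t, removed=set(B))


# Hypothetical example: two parallel routes s -> a -> t and s -> b -> t.
arcs = [("s", "a"), ("s", "b"), ("a", "t"), ("b", "t")]
```

With these arcs, the set {0} is not a separator (the route through b survives), while {0, 1} and {2, 3} are. The degenerate case of Example 10.1.5 shows up as well: the empty set is a separator for the one-arc digraph with a single arc from t to s.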


We can now state the first Menger’s theorem: 


Theorem 10.1.6 (arc-Menger theorem for directed graphs, version 1). Let
$D = (V, A, \psi)$ be a multidigraph, and let s and t be two distinct vertices of D.
Then, the maximum number of pairwise arc-disjoint paths from s to t equals 
the minimum size of an s-t-arc-separator. 


Example 10.1.7. Let D be the multidigraph from Example 10.1.4. Then,
the minimum size of an s-t-arc-separator is 2 (indeed, $\{\beta, \gamma\}$ is an s-t-arc-
separator of size 2, and it is easy to see that there are no s-t-arc-separators
of smaller size). Hence, Theorem 10.1.6 yields that the maximum number of
pairwise arc-disjoint paths from s to t is 2 as well. And indeed, we can easily
find 2 arc-disjoint paths from s to t, namely the red and the blue paths in the
following figure:


Before proving Theorem 10.1.6, let me state another variant of this theorem,
which is closer to the proof. First, some notations:


Definition 10.1.8. Let D = (V, A, y) be a multidigraph, and let s and t be 
two distinct vertices of D. 


(a) For each subset S of V, we set $\overline{S} := V \setminus S$ and

$$[S, \overline{S}] := \{a \in A \mid \text{the source of } a \text{ belongs to } S,
\text{ and the target of } a \text{ belongs to } \overline{S}\}.$$


(These are the same definitions that we introduced for networks in Def- 


inition ) 


(b) An s-t-cut means a subset of A that has the form $[S, \overline{S}]$, where S is a
subset of V that satisfies $s \in S$ and $t \notin S$. (This was just called a “cut”
back in Definition (a).)


An s-t-cut is called this way because its removal would cut the vertex s from 
the vertex t. More precisely: 


Remark 10.1.9. Let $D = (V, A, \psi)$ be a multidigraph, and let s and t be two
distinct vertices of D. Then, any s-t-cut is an s-t-arc-separator.


Proof. Let B be an s-t-cut. We must prove that B is an s-t-arc-separator. In other
words, we must prove that each path from s to t contains at least one arc from
B.

We know that B is an s-t-cut. In other words, $B = [S, \overline{S}]$, where S is a subset
of V that satisfies $s \in S$ and $t \notin S$. Consider this subset S.

Each path from s to t starts at a vertex in S (since $s \in S$) and ends at a
vertex outside of S (since $t \notin S$). Thus, each such path has to escape the set
S at some point, i.e., it must contain an arc whose source is in S and whose
target is outside of S. But such an arc must necessarily belong to $[S, \overline{S}]$ (by
the definition of $[S, \overline{S}]$). Thus, each path from s to t must contain an arc from
$[S, \overline{S}]$. In other words, each path from s to t must contain an arc from B (since
$B = [S, \overline{S}]$). In other words, B is an s-t-arc-separator (by the definition of an
s-t-arc-separator). This proves Remark 10.1.9. □


Theorem 10.1.10 (arc-Menger theorem for directed graphs, version 2). Let
$D = (V, A, \psi)$ be a multidigraph, and let s and t be two distinct vertices of
D. Then, the maximum number of pairwise arc-disjoint paths from s to t 
equals the minimum size of an s-t-cut. 
Example 10.1.11. Let D be the following multidigraph:


Then, the maximum number of pairwise arc-disjoint paths from s to t is 3. 
Indeed, the following picture shows 3 such paths in red, blue and brown, 
respectively: 


More than 3 pairwise arc-disjoint paths from s to t cannot exist in D, since 
(e.g.) there are only 3 arcs outgoing from s. 

By Theorem 10.1.10, this shows that the minimum size of an s-t-cut in D
is 3 as well. There are many s-t-cuts of size 3 (for instance, the “obvious” cut
$[\{s\}, \overline{\{s\}}]$ has this property, as does the s-t-cut $[\{s, a, f\}, \overline{\{s, a, f\}}]$).
Let us now reverse the direction of the arc from c to e in D (thus
destroying the brown path). The resulting multidigraph D' looks as follows:

This digraph D' has no more than 2 pairwise arc-disjoint paths from s to t.
This can be seen by observing that the s-t-cut $[\{s, c\}, \overline{\{s, c\}}]$ has size 2 (it
consists of the arc from s to a and the arc from s to b), so that the minimum
size of an s-t-cut is at most 2, and therefore (by Theorem 10.1.10) the
maximum number of pairwise arc-disjoint paths from s to t is at most 2 as well. It
is easy to see that the latter number is exactly 2 (since our red and blue paths
still exist in D').


To prove the above two arc-Menger theorems, we need one more lemma 
about networks. We recall the notations from Section and from Definition 
and introduce a couple more: 


Definition 10.1.12. Let $D = (V, A, \psi)$ be a multidigraph. Let $f, g : A \to \mathbb{N}$
be two maps. Then:


(a) We let f + g denote the map from A to $\mathbb{N}$ that sends each arc $a \in A$ to
$f(a) + g(a)$. (This is the pointwise sum of f and g.)


(b) We write $g \le f$ if and only if each arc $a \in A$ satisfies $g(a) \le f(a)$.


(c) If $g \le f$, then we let f − g denote the map from A to $\mathbb{N}$ that sends each
arc $a \in A$ to $f(a) - g(a)$. (This is really a map to $\mathbb{N}$, since $g \le f$ entails
$g(a) \le f(a)$.)


These notations satisfy the properties that you’d expect: e.g., the pointwise
sum of maps from A to $\mathbb{N}$ is associative (meaning that $(f + g) + h = f + (g + h)$,
so that you can write f + g + h for both sides); inequalities can be
manipulated in the usual way (e.g., we have $f - g \le h$ if and only if $f \le g + h$).
Verifying this all is straightforward.

The following definition codifies the flows that we constructed in Remark 
9.1.9 


Definition 10.1.13. Let N be a network consisting of a multidigraph
$D = (V, A, \psi)$, a source $s \in V$, a sink $t \in V$ and a capacity function $c : A \to \mathbb{N}$.
Let p be a path from s to t in D. Then, we define a map $f_p : A \to \mathbb{N}$ by
setting

$$f_p(a) = \begin{cases} 1, & \text{if } a \text{ is an arc of } p; \\ 0, & \text{otherwise} \end{cases}
\qquad \text{for each } a \in A.$$

We call this map $f_p$ the path flow of p. It is an actual flow of value 1 if all
the arcs of p have capacity $\ge 1$.
Example 10.1.14. Consider the following network:


where each arc has capacity 1. Then, the path p = (s,2,6,t) leads to the 
following path flow fp: 




Here, in order not to crowd the picture, we have left out the “of 1” part of 
the label of each arc (so you should read the “0”s and the “1”s atop the arcs 
as “0 of 1” and “1 of 1”, respectively). 


The path flow thus turns any path from s to t in a network into a flow,
provided that the arcs have enough capacity to carry this flow. If we have m
paths $p_1, p_2, \ldots, p_m$ from s to t, then we can add their path flows together,
and obtain a flow $f_{p_1} + f_{p_2} + \cdots + f_{p_m}$ of value m, provided (again) that the
arcs have enough capacity for it. (In general, we cannot uniquely reconstruct
$p_1, p_2, \ldots, p_m$ back from this latter flow, as they might have gotten “mixed
together”.)
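
Path flows and their pointwise sums are simple to compute, and the proviso about capacities can be seen on a small example. The sketch below is an illustration only: it assumes that arcs are stored as (source, target) pairs, that a path is given as a list of arc indices, and the helper names are made up.

```python
def path_flow(num_arcs, path):
    """The path flow f_p: 1 on the arcs of the path, 0 elsewhere."""
    on_path = set(path)
    return [1 if i in on_path else 0 for i in range(num_arcs)]


def pointwise_sum(*maps):
    """Pointwise sum of maps A -> N, each given as a list indexed by arcs."""
    return [sum(values) for values in zip(*maps)]


def is_flow(arcs, cap, f, s, t):
    """Check the capacity constraints and the conservation constraints."""
    if any(not (0 <= f[i] <= cap[i]) for i in range(len(arcs))):
        return False
    vertices = {v for arc in arcs for v in arc}
    for v in vertices - {s, t}:
        into = sum(f[i] for i, (_, y) in enumerate(arcs) if y == v)
        out = sum(f[i] for i, (x, _) in enumerate(arcs) if x == v)
        if into != out:
            return False
    return True


# Hypothetical example network with unit capacities.
arcs = [("s", "a"), ("s", "b"), ("a", "b"), ("a", "t"), ("b", "t")]
cap = [1, 1, 1, 1, 1]
p = [0, 3]     # s -> a -> t
q = [1, 4]     # s -> b -> t
r = [0, 2, 4]  # s -> a -> b -> t
good = pointwise_sum(path_flow(5, p), path_flow(5, q))
bad = pointwise_sum(path_flow(5, p), path_flow(5, r))
```

Here `good` is a flow of value 2, since p and q share no arc. In contrast, `bad` assigns 2 to the arc from s to a, which exceeds its capacity 1, so it is not a flow on this network.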

Our next lemma can be viewed as a (partial) converse of this observation:
Any flow f of value m “contains” a sum $f_{p_1} + f_{p_2} + \cdots + f_{p_m}$ of m path flows
$f_{p_1}, f_{p_2}, \ldots, f_{p_m}$ corresponding to m (not necessarily distinct) paths $p_1, p_2, \ldots, p_m$
from s to t. Here, the word “contains” signals that f is not necessarily equal to
$f_{p_1} + f_{p_2} + \cdots + f_{p_m}$, but only satisfies $f_{p_1} + f_{p_2} + \cdots + f_{p_m} \le f$ in general. So
here is the lemma:
Lemma 10.1.15 (flow path decomposition lemma). Let N be a network
consisting of a multidigraph $D = (V, A, \psi)$, a source $s \in V$, a sink $t \in V$ and a
capacity function $c : A \to \mathbb{N}$. Let f be a flow on N that has value m. Then,
there exist m paths $p_1, p_2, \ldots, p_m$ from s to t in D such that

$$f_{p_1} + f_{p_2} + \cdots + f_{p_m} \le f.$$


Proof. We induct on m.

The base case (m = 0) is obvious (since the empty sum $f_{p_1} + f_{p_2} + \cdots + f_{p_m}$ is
the zero flow, and thus is $\le f$ because of the capacity constraints).

Induction step: Let m be a positive integer. Assume (as the induction
hypothesis) that the lemma holds for m − 1. We must prove the lemma for m.

So we consider a flow f on N that has value m. We need to show that there
exist m paths $p_1, p_2, \ldots, p_m$ from s to t in D such that $f_{p_1} + f_{p_2} + \cdots + f_{p_m} \le f$.

We shall first find some path p from s to t such that $f_p \le f$.

We shall refer to the arcs $a \in A$ satisfying $f(a) > 0$ as the active arcs. Let
$A' := \{a \in A \mid f(a) > 0\}$ be the set of these active arcs. Consider the spanning
subdigraph $D' := (V, A', \psi|_{A'})$ of D.

Let S be the set of all vertices $v \in V$ such that D' has a path from s to v. Then,
$s \in S$ (since the trivial path (s) is a path of D').

We next claim that each arc $b \in [S, \overline{S}]$ satisfies $f(b) = 0$.

[Proof: Assume the contrary. Thus, some arc $b \in [S, \overline{S}]$ satisfies $f(b) \neq 0$.
Consider this b. From $f(b) \neq 0$, we obtain $f(b) > 0$ (since f is a flow), thus
$b \in A'$ (by the definition of A'). Hence, b is an arc of D' (by the definition of
D').

Let u be the source of the arc b, and v its target. Since $b \in [S, \overline{S}]$, we therefore
have $u \in S$ and $v \in \overline{S}$. Since $u \in S$, the digraph D' has a path p from s to u (by
the definition of S). Consider this path p. Appending the arc b and the vertex v
at the end of this path p, we obtain a walk from s to v in D' (since b is an arc of
D' with source u and target v). Hence, the digraph D' has a walk from s to v,
thus also a path from s to v (by Corollary 4.5.8). This means that $v \in S$ (by the
definition of S). But this contradicts $v \in \overline{S} = V \setminus S$. This contradiction shows
that our assumption was wrong, qed.]

We thus have proved that each $b \in [S, \overline{S}]$ satisfies $f(b) = 0$. Therefore,
$f(S, \overline{S}) = 0$ (using the notations of Definition (d)). However, recall that
$s \in S$. Thus, if we had $t \notin S$, then Proposition 9.3.2 (b) would yield

$$|f| = \underbrace{f(S, \overline{S})}_{= 0} - \underbrace{f(\overline{S}, S)}_{\ge 0} \le 0 - 0 = 0,$$

which would contradict $|f| = m > 0$. Hence, we must have $t \in S$. In other
words, the digraph D' has a path from s to t (by the definition of S). Let p be
this path. Then, p is also a path in D and satisfies $f_p \le f$ ¹¹⁸. Therefore,
$f - f_p$ is a map from A to $\mathbb{N}$. Moreover, $f - f_p$ is again a flow¹¹⁹ and has value
$|f - f_p| = m - 1$ ¹²⁰. Thus, by the induction hypothesis, we can apply Lemma
10.1.15 to m − 1 and $f - f_p$ instead of m and f. As a result, we conclude that
there exist m − 1 paths $p_1, p_2, \ldots, p_{m-1}$ from s to t in D such that
$f_{p_1} + f_{p_2} + \cdots + f_{p_{m-1}} \le f - f_p$. Consider these m − 1 paths $p_1, p_2, \ldots, p_{m-1}$, and set
$p_m := p$. Then, $f_{p_1} + f_{p_2} + \cdots + f_{p_{m-1}} \le f - f_p = f - f_{p_m}$ (since $p = p_m$), so
that $f_{p_1} + f_{p_2} + \cdots + f_{p_m} \le f$.

Thus, we have found m paths $p_1, p_2, \ldots, p_m$ from s to t in D such that
$f_{p_1} + f_{p_2} + \cdots + f_{p_m} \le f$. But this is precisely what we wanted. Thus, the induction
step is complete, and Lemma 10.1.15 is proved. □


118 Proof. We need to prove that each arc $a \in A$ satisfies $f_p(a) \le f(a)$.
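
The induction in this proof is effectively an algorithm: as long as the flow has positive value, find a path from s to t that uses only active arcs, record it, and subtract its path flow. A minimal sketch follows, under the assumptions that arcs are stored as (source, target) pairs, that f is a valid integer flow given as a list indexed like the arcs, and that paths are returned as lists of arc indices; the function name `decompose` is made up.

```python
def decompose(arcs, f, s, t):
    """Return paths p_1, ..., p_m (as lists of arc indices) from s to t
    whose path flows sum to at most f, where m is the value of f."""
    f = list(f)  # work on a copy; this plays the role of f - f_p
    paths = []
    while True:
        # Search the subdigraph D' formed by the active arcs (f > 0).
        parent = {s: None}  # vertex -> index of the arc used to reach it
        stack = [s]
        while stack:
            u = stack.pop()
            for i, (x, y) in enumerate(arcs):
                if x == u and f[i] > 0 and y not in parent:
                    parent[y] = i
                    stack.append(y)
        if t not in parent:
            return paths  # for a valid flow this means the value is now 0
        # Walk back from t to s to read off the path.
        path = []
        v = t
        while parent[v] is not None:
            i = parent[v]
            path.append(i)
            v = arcs[i][0]
        path.reverse()
        for i in path:
            f[i] -= 1  # subtract the path flow of this path
        paths.append(path)


# Hypothetical example: a flow of value 2 (capacities of at least 2 assumed).
arcs = [("s", "a"), ("s", "b"), ("a", "b"), ("a", "t"), ("b", "t")]
flow = [2, 0, 1, 1, 1]
paths = decompose(arcs, flow, "s", "t")
```

In this example the two paths found both use the arc from s to a, which illustrates that the paths in the lemma need not be distinct or arc-disjoint in general.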


Remark 10.1.16. There exists an alternative proof of Lemma 10.1.15, which
is too nice to leave unmentioned. Here is a quick outline: Consider a new
multidigraph that is obtained from D by replacing each arc a by f(a) many
parallel arcs (if f(a) = 0, this means that a is simply removed). Add m
many arcs from t to s to this new multidigraph. The resulting digraph is
balanced (because of the conservation constraints for f). It may fail to be
weakly connected; however, the vertices s and t belong to the same weak
component of it (as long as m > 0). Hence, applying the directed Euler-
Hierholzer theorem (Theorem (a)) to this component, we see that this
component has an Eulerian circuit. Cutting the m arcs from t to s out of this
circuit, we obtain m arc-disjoint walks from s to t. Each of these m walks
contains some path from s to t, and thus we obtain m paths $p_1, p_2, \ldots, p_m$
from s to t in D such that $f_{p_1} + f_{p_2} + \cdots + f_{p_m} \le f$.


Remark 10.1.17. Let N be a network consisting of a multidigraph
$D = (V, A, \psi)$, a source $s \in V$, a sink $t \in V$ and a capacity function $c : A \to \mathbb{N}$. If
c is a cycle of D, then we can define a map $f_c : A \to \mathbb{N}$ by setting

$$f_c(a) = \begin{cases} 1, & \text{if } a \text{ is an arc of } c; \\ 0, & \text{otherwise} \end{cases}
\qquad \text{for each } a \in A.$$

We call this map $f_c$ the cycle flow of c. It is an actual flow of value 0 if all the
arcs of c have capacity $\ge 1$.

(Footnote 118, continued.) So let $a \in A$ be an arc. If a is not an arc of p, then the definition
of $f_p$ yields $f_p(a) = 0 \le f(a)$ (since f is a flow), so we are done in this case. Hence, assume
WLOG that a is an arc of p. Thus, a is an arc of D' (since p is a path of D'). In other words,
$a \in A'$. By the definition of A', this means that $f(a) > 0$. Since f(a) is an integer, we thus
have $f(a) \ge 1 = f_p(a)$ (since a is an arc of p). In other words, $f_p(a) \le f(a)$. This is precisely
what we wanted to prove.

119 Here, we are using the fact (which is straightforward to prove) that if g and h are two flows
with $h \le g$, then g − h is again a flow.

120 Here, we are using the fact (which is straightforward to prove) that if g and h are two flows
satisfying $h \le g$, then $|g - h| = |g| - |h|$. Applying this fact to g = f and $h = f_p$, we obtain
$$|f - f_p| = \underbrace{|f|}_{= m} - \underbrace{|f_p|}_{= 1} = m - 1.$$
Now, the conclusion of Lemma 10.1.15 can be improved as follows: There
exist m paths $p_1, p_2, \ldots, p_m$ from s to t in D as well as a (possibly empty)
collection of cycles $c_1, c_2, \ldots, c_k$ of D such that

$$f = \left(f_{p_1} + f_{p_2} + \cdots + f_{p_m}\right) + \left(f_{c_1} + f_{c_2} + \cdots + f_{c_k}\right).$$

Proving this improved claim is a bit harder than proving Lemma 10.1.15,
but not by too much (in particular, the argument in Remark 10.1.16 can be
adapted, since a walk becomes a path if we successively remove all cycles
from it).


Proof of Theorem [10.1.10) We make D into a network N (with source s and sink 
t) by assigning the capacity 1 to each arc a € A. Clearly, a cut of this network 
is the same as what we call an s-t-cut. Moreover, the capacity c (S, S) of a cut 
[S,S] is simply the size of this cut (since each arc has capacity 1). 

The max-flow-min-cut theorem (Theorem tells us that the maximum 
value of a flow equals the minimum capacity of a cut, i.e., the minimum size of 
an s-t-cut (because, as we just explained, a cut is the same as an s-t-cut, and its 
capacity is simply its size). It thus remains to show that the maximum value of 
a flow is the maximum number of pairwise arc-disjoint paths from s to t. But 
this is easy by now: 


• If you have a flow f of value m, then you can find m pairwise arc-disjoint
paths from s to t (because Lemma 10.1.15 gives you m paths $p_1, p_2, \ldots, p_m$
such that $f_{p_1} + f_{p_2} + \cdots + f_{p_m} \leq f$, and the latter inequality tells you that
these m paths $p_1, p_2, \ldots, p_m$ are arc-disjoint¹²¹). Thus,


(the maximum number of pairwise arc-disjoint paths from s to t)
≥ (the maximum value of a flow). (49)


121 Proof. Assume the contrary. Thus, these m paths are not arc-disjoint. In other words, there
exists an arc a that is used by two paths $p_i$ and $p_j$ with $i \neq j$. Consider this arc a and
the corresponding indices i and j. Since a is used by $p_i$, we have $f_{p_i}(a) = 1$. Likewise,
$f_{p_j}(a) = 1$. However, $f_{p_1} + f_{p_2} + \cdots + f_{p_m} \leq f$, so that

\begin{align*}
\left( f_{p_1} + f_{p_2} + \cdots + f_{p_m} \right)(a)
&\leq f(a) \leq c(a) && \text{(by the capacity constraints)} \\
&= 1 && \text{(since each arc has capacity 1)}.
\end{align*}
Thus,
\begin{align*}
1 &\geq \left( f_{p_1} + f_{p_2} + \cdots + f_{p_m} \right)(a)
   = f_{p_1}(a) + f_{p_2}(a) + \cdots + f_{p_m}(a) \\
  &\geq \underbrace{f_{p_i}(a)}_{= 1} + \underbrace{f_{p_j}(a)}_{= 1} \\
  &= 1 + 1 > 1
\end{align*}
(here, the second inequality holds since $f_{p_i}(a)$ and $f_{p_j}(a)$ are two distinct addends
of the sum $f_{p_1}(a) + f_{p_2}(a) + \cdots + f_{p_m}(a)$ (because $i \neq j$), and since all the remaining
addends are $\geq 0$ (since $f_p(a) \geq 0$ for each path p)),
which is absurd. This contradiction shows that our assumption was false, qed.


An introduction to graph theory, version August 2, 2023 page 372 


• Conversely, if you have m pairwise arc-disjoint paths $p_1, p_2, \ldots, p_m$ from
s to t, then you obtain a flow of value m (namely, $f_{p_1} + f_{p_2} + \cdots + f_{p_m}$ is
such a flow¹²²). Thus,


(the maximum value of a flow)
≥ (the maximum number of pairwise arc-disjoint paths from s to t).


Combining this last inequality with (49), we obtain 


(the maximum number of pairwise arc-disjoint paths from s to t) 
= (the maximum value of a flow) 


= (the minimum size of an s-t-cut) (as we have proved before) . 


Thus, Theorem 10.1.10 is proved. □

Theorem 10.1.10 can also be proved without using network flows (see, e.g.,
[Schrij17, Corollary 4.1b] for such a proof).
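
The proof above reduces the count of arc-disjoint paths to a maximum-flow computation in which every arc has capacity 1. The following Python sketch illustrates this reduction on small inputs. It is only an illustration under our own assumptions: the function name `max_arc_disjoint_paths` and the encoding of a multidigraph as a list of (source, target) pairs are hypothetical choices made here, and the code uses the standard augmenting-path method rather than anything specific to these notes.

```python
from collections import deque

def max_arc_disjoint_paths(arcs, s, t):
    """Return (k, cut) for a multidigraph given as a list of (source, target) pairs.

    k   = value of a maximum flow when every arc has capacity 1
          (by Theorem 10.1.10, the maximum number of arc-disjoint s-t paths);
    cut = indices of the arcs in an s-t-cut [S, S-bar] of size k.
    """
    assert s != t, "s and t must be distinct"
    flow = [0] * len(arcs)          # flow[i] is 0 or 1 on arc number i
    out_arcs, in_arcs = {}, {}
    for i, (u, v) in enumerate(arcs):
        out_arcs.setdefault(u, []).append(i)
        in_arcs.setdefault(v, []).append(i)

    def search():
        # Breadth-first search in the residual digraph.
        # pred[v] = (arc index, +1 if used forwards / -1 if used backwards).
        pred = {s: None}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for i in out_arcs.get(u, []):      # unused arc, traversed forwards
                v = arcs[i][1]
                if flow[i] == 0 and v not in pred:
                    pred[v] = (i, +1)
                    queue.append(v)
            for i in in_arcs.get(u, []):       # used arc, traversed backwards
                v = arcs[i][0]
                if flow[i] == 1 and v not in pred:
                    pred[v] = (i, -1)
                    queue.append(v)
        return pred

    k = 0
    while True:
        pred = search()
        if t not in pred:
            break
        v = t
        while pred[v] is not None:             # augment along the path found
            i, sign = pred[v]
            flow[i] += sign
            v = arcs[i][0] if sign == +1 else arcs[i][1]
        k += 1

    S = set(pred)                              # vertices still reachable from s
    cut = [i for i, (u, v) in enumerate(arcs) if u in S and v not in S]
    return k, cut
```

For instance, calling this on the arcs `[("s","a"), ("s","b"), ("a","b"), ("a","t"), ("b","t")]` returns a value k together with a list of k arc indices forming a cut, in accordance with Theorem 10.1.10.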


Proof of Theorem 10.1.6. Let $x$ denote the maximum number of pairwise arc-disjoint
paths from s to t.

Let $n_c$ denote the minimum size of an s-t-cut.

Let $n_s$ denote the minimum size of an s-t-arc-separator.¹²³


122 Proof. First, we observe that the map $f_{p_1} + f_{p_2} + \cdots + f_{p_m}$ satisfies the conservation
constraints (because it is the sum of the functions $f_{p_1}, f_{p_2}, \ldots, f_{p_m}$, each of which satisfies the
conservation constraints). Let us now check that it satisfies the capacity constraints.
Indeed, let $a \in A$ be an arc. Then, a belongs to at most one of the m paths $p_1, p_2, \ldots, p_m$
(since these m paths are arc-disjoint). In other words, at most one of the m numbers
$f_{p_1}(a), f_{p_2}(a), \ldots, f_{p_m}(a)$ equals 1; all the remaining numbers equal 0. Hence, the sum
$f_{p_1}(a) + f_{p_2}(a) + \cdots + f_{p_m}(a)$ of these m numbers equals either 1 or 0; in either case, we
thus have $f_{p_1}(a) + f_{p_2}(a) + \cdots + f_{p_m}(a) \in \{0, 1\}$. Now,

\[ \left( f_{p_1} + f_{p_2} + \cdots + f_{p_m} \right)(a) = f_{p_1}(a) + f_{p_2}(a) + \cdots + f_{p_m}(a) \in \{0, 1\}, \]

so that
\[ 0 \leq \left( f_{p_1} + f_{p_2} + \cdots + f_{p_m} \right)(a) \leq 1 = c(a) \]

(since each arc has capacity 1). Since we have proved this for each arc $a \in A$, we thus have
shown that the map $f_{p_1} + f_{p_2} + \cdots + f_{p_m}$ satisfies the capacity constraints. Hence, this map
is a flow (since it also satisfies the conservation constraints).

It remains to show that the value of this flow is m. But this is easy: For any flows
$g_1, g_2, \ldots, g_k$, we have $|g_1 + g_2 + \cdots + g_k| = |g_1| + |g_2| + \cdots + |g_k|$ (this is straightforward
to see from the definition of value). Thus,

\[ \left| f_{p_1} + f_{p_2} + \cdots + f_{p_m} \right| = |f_{p_1}| + |f_{p_2}| + \cdots + |f_{p_m}| = \sum_{k=1}^{m} \underbrace{|f_{p_k}|}_{= 1} = \sum_{k=1}^{m} 1 = m. \]

In other words, the value of the flow $f_{p_1} + f_{p_2} + \cdots + f_{p_m}$ is m.


123 If you are wondering why we chose the baroque notations “$x$”, “$n_c$” and “$n_s$” for these
three numbers: The letter “x” appears in “maximum”, whereas the letter “n” appears in
“minimum”. The subscripts “c” and “s” should be reasonably clear.



Theorem 10.1.10 says that $x = n_c$. Our goal is to prove that $x = n_s$.

Remark 10.1.9 shows that any s-t-cut is an s-t-arc-separator. Thus, $n_s \leq n_c$.

The inequality $x \leq n_s$ follows easily from the pigeonhole principle¹²⁴. Combining
this with $n_s \leq n_c = x$ (since $x = n_c$), we obtain $x = n_s$. Thus, Theorem 10.1.6
is proved. □
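
The equality $n_s = n_c$ obtained in this proof can be checked by brute force on very small digraphs. The sketch below does this by enumerating all arc subsets and all vertex subsets; it is exponential and meant purely as an illustration. The helper names `has_path`, `min_arc_separator_size` and `min_cut_size`, as well as the list-of-pairs encoding of the arcs, are our own hypothetical choices.

```python
from itertools import combinations

def has_path(arcs, s, t, removed=()):
    """Is there a path from s to t that avoids the arcs with indices in `removed`?"""
    removed = set(removed)
    seen, stack = {s}, [s]
    while stack:
        u = stack.pop()
        if u == t:
            return True
        for i, (a, b) in enumerate(arcs):
            if i not in removed and a == u and b not in seen:
                seen.add(b)
                stack.append(b)
    return False

def min_arc_separator_size(arcs, s, t):
    """Smallest size of an s-t-arc-separator (assumes s != t)."""
    for r in range(len(arcs) + 1):
        for B in combinations(range(len(arcs)), r):
            if not has_path(arcs, s, t, removed=B):
                return r

def min_cut_size(vertices, arcs, s, t):
    """Smallest size of an s-t-cut [S, S-bar], over all S with s in S and t not in S."""
    others = [v for v in vertices if v not in (s, t)]
    best = None
    for r in range(len(others) + 1):
        for extra in combinations(others, r):
            S = {s, *extra}
            size = sum(1 for (a, b) in arcs if a in S and b not in S)
            if best is None or size < best:
                best = size
    return best
```

On any small example with distinct s and t, the two functions should return the same number, as Theorem 10.1.6 and Theorem 10.1.10 together predict.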


Exercise 10.1. Let D be a balanced multidigraph. Let s and t be two vertices 
of D. Let $k \in \mathbb{N}$. Assume that D has k pairwise arc-disjoint paths from s to
t. Show that D has k pairwise arc-disjoint paths from t to s. 


Exercise 10.2. Let D be a multidigraph. Let $k \in \mathbb{N}$. Let u, v and w be
three vertices of D. Assume that there exist k arc-disjoint paths from u to v. 
Assume furthermore that there exist k arc-disjoint paths from v to w. 

Prove that there exist k arc-disjoint paths from u to w. 

[Note: If u = w, then the trivial path (u) counts as being arc-disjoint from
itself (so in this case, there exist arbitrarily many arc-disjoint paths from u to w).]


[Solution: This is Exercise 3 on midterm #2 from my Spring 2017 course 
(except that it is stated for multidigraphs instead of simple digraphs); see the 
course page for solutions. ] 


We can also extend the arc-Menger theorem to paths between different pairs 
of vertices: 


Theorem 10.1.18 (arc-Menger theorem for directed graphs, multi-terminal
version). Let $D = (V, A, \psi)$ be a multidigraph, and let X and Y be two disjoint
subsets of V.

A path from X to Y shall mean a path whose starting point belongs to X
and whose ending point belongs to Y.

An X-Y-cut shall mean a subset of A that has the form $[S, \overline{S}]$, where S is a
subset of V that satisfies $X \subseteq S$ and $Y \subseteq \overline{S}$.

Then, the maximum number of pairwise arc-disjoint paths from X to Y 
equals the minimum size of an X-Y-cut. 


124 Proof. We know that there exist $x$ pairwise arc-disjoint paths from s to t (by the definition of
$x$). Let $p_1, p_2, \ldots, p_x$ be these $x$ paths.

We know that there exists an s-t-arc-separator of size $n_s$ (by the definition of $n_s$). Let B
be this s-t-arc-separator. Thus, each path from s to t contains at least one arc from B (by
the definition of an s-t-arc-separator). Hence, in particular, each of the $x$ paths $p_1, p_2, \ldots, p_x$
contains at least one arc from B. These altogether $x$ arcs must be distinct (since the $x$ paths
$p_1, p_2, \ldots, p_x$ are arc-disjoint); thus, we have found at least $x$ arcs that belong to B. This
shows that $|B| \geq x$. However, B has size $n_s$; in other words, we have $|B| = n_s$. Thus,
$n_s = |B| \geq x$, so that $x \leq n_s$.
Example 10.1.19. Here is an example of a digraph $D = (V, A, \psi)$, with two
disjoint subsets X and Y of V drawn as ovals:


In this digraph D, the maximum number of pairwise arc-disjoint paths from 
X to Y is 2; here are two such paths (marked in red and blue): 


According to Theorem 10.1.18, the minimum size of an X-Y-cut must thus
also be 2. And indeed, here is such an X-Y-cut:


Proof of Theorem 10.1.18. We transform our digraph $D = (V, A, \psi)$ into a new
multidigraph $D' = (V', A', \psi')$ as follows:


• We replace all the vertices in X by a single (new) vertex s, and replace all
the vertices in Y by a single (new) vertex t. (Thus, formally speaking, we
set $V' = (V \setminus (X \cup Y)) \cup \{s, t\}$, where s and t are two objects not in V.)

For any vertex $p \in V$, we define a vertex $p' \in V'$ by

\[ p' = \begin{cases} s, & \text{if } p \in X; \\ t, & \text{if } p \in Y; \\ p, & \text{otherwise.} \end{cases} \]

We refer to this vertex $p'$ as the projection of p.

• We keep all the arcs of D around, but we replace all their endpoints (i.e.,
sources and targets) by their projections (thus, any endpoint in X gets
replaced by s, and any endpoint in Y gets replaced by t, while an endpoint
that belongs neither to X nor to Y stays unchanged). For example, an arc
with source in X becomes an arc with source s. (Formally speaking,
this means the following: We set $A' = A$, and we define the map
$\psi' : A' \to V' \times V'$ as follows: For any $a \in A' = A$, we set $\psi'(a) = (u', v')$,
where $(u, v) = \psi(a)$.)


For instance, if D is the digraph from Example 10.1.19, then D' looks as
follows:


Now, Theorem 10.1.10 (applied to $D' = (V', A', \psi')$ instead of $D = (V, A, \psi)$)
shows that the maximum number of pairwise arc-disjoint paths from s to t in 
D’ equals the minimum size of an s-t-cut in D’. 

Let us now connect this with the claim that we want to prove. It is easy to
see that the minimum size of an s-t-cut in D' equals the minimum size of an
X-Y-cut in D (indeed, the s-t-cuts in D' are precisely the X-Y-cuts in D¹²⁵).
If we can also show that the maximum number of pairwise arc-disjoint paths
from s to t in D' equals the maximum number of pairwise arc-disjoint paths
from X to Y in D, then the result of the preceding paragraph will thus become
the claim of Theorem 10.1.18, so we will be done.

So how can we show that the maximum number of pairwise arc-disjoint 
paths from s to t in D’ equals the maximum number of pairwise arc-disjoint 
paths from X to Y in D ? It would be easy if there was a well-behaved bijection 
between the former paths and the latter paths that preserves the arcs of any 
path, but this is not quite the case. Each path from X to Y in D becomes a walk 
from s to t in D’ if we replace each of its vertices by its projection. However, the 
latter walk is not necessarily a path, since different vertices can have the same 
projection. 

Fortunately, this is easy to fix. If we have k pairwise arc-disjoint paths from 
X to Y in D, then we can turn them into k pairwise arc-disjoint walks from s to 
t in D’, and then we also obtain k pairwise arc-disjoint paths from s to t in D’ 
(since any walk from s to t contains a path from s to t). Thus, 


(the maximum number of pairwise arc-disjoint paths from s to t in D')
≥ (the maximum number of pairwise arc-disjoint paths from X to Y in D).


125 In more detail:

• Any s-t-cut in D' has the form $[S, \overline{S}]$ for some subset S of V' satisfying $s \in S$ and
$t \notin S$; it is therefore equal to the set $[S', \overline{S'}]$, where S' is the subset of V given by
$S' := (S \setminus \{s\}) \cup X$. Therefore, it is an X-Y-cut in D.

• Conversely, any X-Y-cut in D has the form $[S, \overline{S}]$ for some subset S of V satisfying
$X \subseteq S$ and $Y \subseteq \overline{S}$; it is therefore equal to the set $[S', \overline{S'}]$, where S' is the subset of V'
given by $S' := (S \setminus X) \cup \{s\}$. Therefore, it is an s-t-cut in D'.
Conversely, if we have k pairwise arc-disjoint paths from s to t in D', then
we can “lift” these k paths back to the digraph D (preserving the arcs, and
replacing the vertices s and t by appropriate vertices in X and Y to make them
belong to the right arcs), and thus obtain k pairwise arc-disjoint paths from X
to Y in D. Thus,


(the maximum number of pairwise arc-disjoint paths from X to Y in D)
≥ (the maximum number of pairwise arc-disjoint paths from s to t in D').

Combining these two inequalities, we obtain


(the maximum number of pairwise arc-disjoint paths from s to t in D')
= (the maximum number of pairwise arc-disjoint paths from X to Y in D).


As explained above, this completes the proof of Theorem 10.1.18. □
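
The construction of D' used in this proof is easy to carry out mechanically. Here is a minimal Python sketch of it, assuming that the digraph is given as a collection of vertices and a list of (source, target) pairs. The function name `contract_terminals` and the default labels `"s*"` and `"t*"` for the two new vertices are hypothetical choices of ours.

```python
def contract_terminals(vertices, arcs, X, Y, s="s*", t="t*"):
    """Build D' from D by merging X into a new vertex s and Y into a new vertex t.

    vertices: iterable of vertices of D; arcs: list of (source, target) pairs.
    Returns (V', A'), where A' lists the arcs in the same order as `arcs`
    (so arc number i of D' corresponds to arc number i of D).
    """
    X, Y = set(X), set(Y)
    assert not (X & Y), "X and Y must be disjoint"
    assert s not in vertices and t not in vertices, "s and t must be new objects"

    def projection(p):
        # The projection p' of a vertex p, as defined in the proof.
        if p in X:
            return s
        if p in Y:
            return t
        return p

    new_vertices = (set(vertices) - X - Y) | {s, t}
    new_arcs = [(projection(u), projection(v)) for (u, v) in arcs]
    return new_vertices, new_arcs
```

Note that an arc of D with both endpoints in X becomes a loop at s in D'; this is harmless, since no path from s to t can use a loop.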


10.1.2. The edge-Menger theorem for undirected graphs 


We shall now state analogues of Theorem 10.1.6 and Theorem 10.1.10 for undirected
graphs. First, the unsurprising definitions:


Definition 10.1.20. Two walks p and q in a graph are said to be edge-disjoint
if they have no edge in common. 


Definition 10.1.21. Let $G = (V, E, \varphi)$ be a multigraph, and let s and t be two
vertices of G. A subset B of E is said to be an s-t-edge-separator if each path
from s to t contains at least one edge from B. Equivalently, a subset B of E is
said to be an s-t-edge-separator if the multigraph $(V, E \setminus B, \varphi|_{E \setminus B})$ has no
path from s to t (in other words, removing from G all edges contained in B
destroys all paths from s to t).


Now comes the analogue of Theorem 10.1.6:


Theorem 10.1.22 (edge-Menger theorem for undirected graphs, version 1). 
Let $G = (V, E, \varphi)$ be a multigraph, and let s and t be two distinct vertices of
G. Then, the maximum number of pairwise edge-disjoint paths from s to t 
equals the minimum size of an s-t-edge-separator. 


To state the analogue of Theorem 10.1.10, we need to first adapt Definition
10.1.8 to undirected graphs:


Definition 10.1.23. Let $G = (V, E, \varphi)$ be a multigraph, and let s and t be two
distinct vertices of G.

(a) For each subset S of V, we set $\overline{S} := V \setminus S$ and

\[ [S, \overline{S}]_{\text{und}} := \{ e \in E \mid \text{one endpoint of } e \text{ belongs to } S, \text{ while the other belongs to } \overline{S} \}. \]

(b) An (undirected) s-t-cut means a subset of E that has the form $[S, \overline{S}]_{\text{und}}$,
where S is a subset of V that satisfies $s \in S$ and $t \notin S$.


The following remark is an analogue of Remark 10.1.9:


Remark 10.1.24. Let $G = (V, E, \varphi)$ be a multigraph, and let s and t be two
distinct vertices of G. Then, any (undirected) s-t-cut is an s-t-edge-separator. 


Proof. Analogous to the proof of Remark 10.1.9. □

And here is the analogue of Theorem 10.1.10:


Theorem 10.1.25 (edge-Menger theorem for undirected graphs, version 2). 
Let $G = (V, E, \varphi)$ be a multigraph, and let s and t be two distinct vertices of
G. Then, the maximum number of pairwise edge-disjoint paths from s to t 
equals the minimum size of an (undirected) s-t-cut. 


Proof of Theorem 10.1.25. We shall not prove this from scratch, but rather derive
this from the directed version (Theorem 10.1.10).

Namely, we apply Theorem 10.1.10 to¹²⁶ $D = G^{\text{bidir}}$. We thus see that the
maximum number of pairwise arc-disjoint paths from s to t (in $G^{\text{bidir}}$) equals
the minimum size of an s-t-cut (in $G^{\text{bidir}}$). This is similar to the claim that
we want to prove, but not quite the same statement, because $G^{\text{bidir}}$ is not G.
To obtain the claim that we want to prove, we must prove the following two
claims:


Claim 1: The maximum number of pairwise arc-disjoint paths from s
to t (in $G^{\text{bidir}}$) equals the maximum number of pairwise edge-disjoint
paths from s to t (in G).


Claim 2: The minimum size of a directed s-t-cut¹²⁷ (in $G^{\text{bidir}}$) equals
the minimum size of an (undirected) s-t-cut (in G).


126 Recall that $G^{\text{bidir}}$ is the multidigraph obtained from G by replacing each edge by two arcs in
opposite directions. (If the edge has endpoints u and v, then one of the two arcs has source
u and target v, while the other has source v and target u.) See Definition 4.4.2 for a formal
definition.

127 A “directed s-t-cut” here simply means an s-t-cut in a digraph.
Claim 2 is very easy to verify, since the directed s-t-cuts in $G^{\text{bidir}}$ are essentially
the same as the undirected s-t-cuts in G.¹²⁸

It remains to verify Claim 1. The simplest approach is to argue that each path
from s to t in $G^{\text{bidir}}$ becomes a path from s to t in G (just replace each arc of
the path by the corresponding undirected edge). Unfortunately, this alone does
not suffice, since two arc-disjoint paths in $G^{\text{bidir}}$ won’t necessarily become edge-disjoint
paths in G. Here is an example of how this can go wrong (imagine that
the two arcs between u and v come from the same edge of G, and the two paths
are marked red and blue):


(50) 


If we replace each arc by the corresponding edge here, then the two paths will 
no longer be edge-disjoint (since the edge between u and v will be used by both 
paths). 

However, this kind of situation can be averted. To do this, we let k be the
maximum number of pairwise arc-disjoint paths from s to t in $G^{\text{bidir}}$. We now
choose k pairwise arc-disjoint paths $p_1, p_2, \ldots, p_k$ from s to t in $G^{\text{bidir}}$ in such
a way that their total length (i.e., the sum of the lengths of $p_1, p_2, \ldots, p_k$) is as
small as possible. Then, it is easy to see that these paths $p_1, p_2, \ldots, p_k$ become
edge-disjoint paths in G when we replace each arc by the corresponding edge.


[Proof: Assume the contrary. Thus, two of these paths $p_1, p_2, \ldots, p_k$ end up sharing
an edge when we replace each arc by the corresponding edge. Let $p_i$ and $p_j$ be these
two paths (where $i \neq j$, of course). Let e be the edge that they end up sharing, and let
u and v be the two endpoints of e, in the order in which they appear on $p_i$. Hence, the
path $p_i$ uses the edge e (or, more precisely, one of the two arcs of $G^{\text{bidir}}$ corresponding
to e) to get from u to v.

Since the paths $p_i$ and $p_j$ are arc-disjoint, they cannot both use the edge e in the same
direction (because this would mean that $p_i$ and $p_j$ share the same arc of $G^{\text{bidir}}$). Hence,
the path $p_j$ uses the edge e to get from v to u (since the path $p_i$ uses the edge e to get
from u to v). Hence, the paths $p_i$ and $p_j$ have the following forms:

\[ p_i = (\ldots, u, e_1, v, \ldots); \]
\[ p_j = (\ldots, v, e_2, u, \ldots), \]


128 In more detail: If S is a subset of V that satisfies $s \in S$ and $t \notin S$, then the directed s-t-cut
$[S, \overline{S}]$ in $G^{\text{bidir}}$ and the undirected s-t-cut $[S, \overline{S}]_{\text{und}}$ in G have the same size (because each
edge in $[S, \overline{S}]_{\text{und}}$ corresponds to exactly one arc in $[S, \overline{S}]$). Thus, the sizes of the directed
s-t-cuts in $G^{\text{bidir}}$ are exactly the sizes of the undirected s-t-cuts in G. In particular, the minimum
size of a former cut equals the minimum size of a latter cut. This proves Claim 2.
where $e_1$ and $e_2$ are the two arcs of $G^{\text{bidir}}$ that correspond to the edge e. Now, let us
replace the two paths $p_i$ and $p_j$ by two new walks¹²⁹

\[ p_i' = \Big( \underbrace{\ldots}_{\substack{\text{the part of } p_i \\ \text{before } u}}, \; u, \; \underbrace{\ldots}_{\substack{\text{the part of } p_j \\ \text{after } u}} \Big) \qquad \text{and} \]
\[ p_j' = \Big( \underbrace{\ldots}_{\substack{\text{the part of } p_j \\ \text{before } v}}, \; v, \; \underbrace{\ldots}_{\substack{\text{the part of } p_i \\ \text{after } v}} \Big). \]

These walks $p_i'$ and $p_j'$ are two walks from s to t, and they don’t use any arcs that
were not already used by $p_i$ or $p_j$. Thus, they are arc-disjoint from all of the paths
$p_1, p_2, \ldots, p_k$ except for $p_i$ and $p_j$. Moreover, they are arc-disjoint from each other (since
$p_i$ and $p_j$ were arc-disjoint, and since the arcs of any path are distinct). Furthermore,
their total length is smaller by 2 than the total length of $p_i$ and $p_j$ (since they use all
the arcs of $p_i$ and $p_j$ except for $e_1$ and $e_2$). They are not necessarily paths, but we can
turn them into paths from s to t by successively removing cycles (as in the proof of
Corollary 4.5.8). If we do this, we end up with two paths $p_i''$ and $p_j''$ from s to t that are
arc-disjoint from each other and from all of the paths $p_1, p_2, \ldots, p_k$ except for $p_i$ and
$p_j$, and whose total length is smaller by at least 2 than the total length of $p_i$ and $p_j$.
Thus, if we replace $p_i$ and $p_j$ by these two paths $p_i''$ and $p_j''$ (while leaving the
remaining k − 2 of our k paths $p_1, p_2, \ldots, p_k$ unchanged), then we obtain k mutually


129 Here is an illustration:


$p_i$ (in red) and $p_j$ (in blue):


$p_i'$ (in red) and $p_j'$ (in blue):


(The wavy arrows stand not for single arcs, but for sequences of multiple arcs.)
arc-disjoint paths from s to t whose total length is smaller than the total length of our
original k paths $p_1, p_2, \ldots, p_k$. However, this is absurd, because we chose our original k
pairwise arc-disjoint paths $p_1, p_2, \ldots, p_k$ from s to t in such a way that their total length
is as small as possible. This contradiction shows that our assumption was wrong. Thus,
we have proved that the paths $p_1, p_2, \ldots, p_k$ become edge-disjoint paths in G when we
replace each arc by the corresponding edge.]


Hence, we have found k pairwise edge-disjoint paths from s to t in G (namely,
the k paths that are obtained from the paths $p_1, p_2, \ldots, p_k$ when we replace each
arc by the corresponding edge). This shows that


(the maximum number of pairwise edge-disjoint paths from s to t in G)
≥ k
= (the maximum number of pairwise arc-disjoint paths from s to t in $G^{\text{bidir}}$)

(by the definition of k). Conversely, we can easily see that


(the maximum number of pairwise arc-disjoint paths from s to t in $G^{\text{bidir}}$)
≥ (the maximum number of pairwise edge-disjoint paths from s to t in G)

(since there is an obvious way to transform paths in G into paths in $G^{\text{bidir}}$ (just
replace each edge by one of the two corresponding arcs of $G^{\text{bidir}}$), and applying
this transformation to edge-disjoint paths of G yields arc-disjoint paths of
$G^{\text{bidir}}$). Combining these two inequalities, we obtain


(the maximum number of pairwise arc-disjoint paths from s to t in $G^{\text{bidir}}$)
= (the maximum number of pairwise edge-disjoint paths from s to t in G).


This proves Claim 1. As we explained, this concludes the proof of Theorem
10.1.25. □
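
Claim 2 of this proof (that directed cuts in the bidirected digraph have the same sizes as undirected cuts in G) can be illustrated by a short computation. The following Python sketch assumes that a multigraph is given as a list of endpoint pairs, one pair per edge; the names `bidirect`, `directed_cut_size` and `undirected_cut_size` are hypothetical, and the treatment of loops (two loop arcs per loop edge) is our own assumption, which does not affect any cut size.

```python
def bidirect(edges):
    """Replace each edge {u, v} (number i) by the two arcs (u, v, i) and (v, u, i).

    The third component records the edge that the arc comes from.
    """
    arcs = []
    for i, (u, v) in enumerate(edges):
        arcs.append((u, v, i))
        arcs.append((v, u, i))
    return arcs

def directed_cut_size(arcs, S):
    """Size of the directed cut [S, S-bar]: arcs with source in S and target outside S."""
    return sum(1 for (u, v, _) in arcs if u in S and v not in S)

def undirected_cut_size(edges, S):
    """Size of the undirected cut [S, S-bar]_und: edges with exactly one endpoint in S."""
    return sum(1 for (u, v) in edges if (u in S) != (v in S))
```

For every vertex subset S, the value `directed_cut_size(bidirect(edges), S)` agrees with `undirected_cut_size(edges, S)`, because each edge crossing the cut contributes exactly one of its two arcs.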


Proof of Theorem 10.1.22. This can be derived from Theorem 10.1.25 and Remark
10.1.24 in the same way as we derived Theorem 10.1.6 from Theorem 10.1.10
and Remark 10.1.9. □


Exercise 10.3. Let G be a multigraph such that each vertex of G has even 
degree. Let s and t be two distinct vertices of G. Prove that the maximum 
number of pairwise edge-disjoint paths from s to t is even. 


10.1.3. The vertex-Menger theorem for directed graphs 


The Menger theorems we have seen so far have been concerned with paths not 
having arcs in common. What if we want to avoid common vertices too? 
Definition 10.1.26. Let p be a path of some graph or digraph. Then, an
intermediate vertex of p shall mean a vertex of p that is neither the starting 
point nor the ending point of p. 


Definition 10.1.27. Two paths p and q in a graph or digraph are said to be 
internally vertex-disjoint if they have no common intermediate vertices. 


Example 10.1.28. The two paths p and q in Example 10.1.2 are arc-disjoint,
but not internally vertex-disjoint.
Here are two internally vertex-disjoint paths p and q:

[Figure: two internally vertex-disjoint paths, labeled p and q.]

One trivial case of internally vertex-disjoint paths is a path of length ≤ 1:
Namely, a path of length ≤ 1 is internally vertex-disjoint from any path,
including itself (since it has no intermediate vertices).


Definition 10.1.29. Let $D = (V, A, \psi)$ be a multidigraph, and let s and t be
two vertices of D. A subset W of V \ {s,t} is said to be an internal s-t- 
vertex-separator if each path from s to t contains at least one vertex from 
W. Equivalently, a subset W of V \ {s, t} is said to be an internal s-t-vertex- 
separator if the induced subdigraph of D on the set V \ W has no path from 
s to t (in other words, removing from D all vertices contained in W destroys 
all paths from s to t). 


Example 10.1.30. Let $D = (V, A, \psi)$ be the following multidigraph:


Then, the sets {a,b} and {a,c} are internal s-t-vertex-separators (indeed, re- 
moving the vertices a and b cuts off s from the rest of the digraph, whereas 
removing the vertices a and c does the same to t), but the sets {a} and {b,c} 
are not (since the path from s to t via c and b avoids a, whereas the path from 
s to t via a avoids b and c). 
Example 10.1.31. Let $D = (V, A, \psi)$ be a multidigraph. Let s and t be two
distinct vertices of D. Then:


(a) The empty set ∅ is an internal s-t-vertex-separator if and only if D has
no path from s to t.


(b) If D has no arc with source s and target t, then the set $V \setminus \{s, t\}$ is an
internal s-t-vertex-separator (since any path from s to t contains at least
one intermediate vertex, and such a vertex must belong to $V \setminus \{s, t\}$).


(c) If D has an arc with source s and target t, then there exists no internal
s-t-vertex-separator (since the “direct” length-1 path from s to t contains
no vertices besides s and t).
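
The second description in Definition 10.1.29 (removing the vertices of W destroys all paths from s to t) translates directly into a reachability test. Here is a minimal Python sketch of such a test, assuming the digraph is given as a list of (source, target) pairs; the function name `is_internal_vertex_separator` is a hypothetical choice of ours. It relies on the fact that a walk from s to t exists if and only if a path from s to t exists.

```python
from collections import deque

def is_internal_vertex_separator(arcs, s, t, W):
    """Check whether W is an internal s-t-vertex-separator.

    arcs: list of (source, target) pairs; W: a set of vertices avoiding s and t.
    Returns True if no path from s to t survives after removing the vertices in W.
    """
    W = set(W)
    assert s not in W and t not in W, "W must be a subset of V minus {s, t}"
    seen = {s}
    queue = deque([s])
    while queue:
        u = queue.popleft()
        if u == t:
            return False            # t is still reachable, so W does not separate
        for (a, b) in arcs:
            if a == u and b not in W and b not in seen:
                seen.add(b)
                queue.append(b)
    return True
```

The three parts of Example 10.1.31 can be observed with this function: the empty set separates exactly when there is no path at all, and an arc from s to t makes every call return False.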


Now, we state the analogue of Theorem 10.1.6 and Theorem 10.1.10 for internally
vertex-disjoint paths:


Theorem 10.1.32 (vertex-Menger theorem for directed graphs). Let $D =
(V, A, \psi)$ be a multidigraph, and let s and t be two distinct vertices of D.
Assume that D has no arc with source s and target t. Then, the maximum 
number of pairwise internally vertex-disjoint paths from s to t equals the 
minimum size of an internal s-t-vertex-separator. 


Example 10.1.33. Let D be the following multidigraph:

[Figure: the multidigraph D.]

Then, the maximum number of pairwise internally vertex-disjoint paths from
s to t is 2. Indeed, the following picture shows 2 such paths in red and blue,
respectively:

[Figure: the multidigraph D with two internally vertex-disjoint paths from s to t, drawn in red and blue.]

Why can there be no 3 such paths? This is not obvious from a quick look, but
can be easily derived from Theorem 10.1.32. Indeed, Theorem 10.1.32 yields
that the maximum number of pairwise internally vertex-disjoint paths from
s to t equals the minimum size of an internal s-t-vertex-separator. Thus, if
the former number was larger than 2, then so would be the latter number.
But this cannot be the case, since the 2-element set {a, f} is easily checked to
be an internal s-t-vertex-separator. Hence, we see that both of these numbers
are 2.


Example 10.1.34. Consider again the digraph D from Example 10.1.11. In that
example, we found 3 pairwise arc-disjoint paths from s to t. These 3 paths 
are not internally vertex-disjoint (in fact, the brown path has non-starting 
and non-ending vertices in common with both the red and the blue path). 
However, there do exist 3 pairwise internally vertex-disjoint paths from s to 
t. Can you find them? 


Proof of Theorem 10.1.32. We will again derive this from the arc-Menger theorem
(Theorem 10.1.10), applied to an appropriate multidigraph $D' = (V', A', \psi')$.

What is this multidigraph D'? The idea is to modify the digraph D in such
a way that paths having a common vertex become paths having a common arc.
The most natural way to achieve this is to “stretch out” each vertex v of D into a
little arc. In order to do this in a systematic manner, we replace each vertex v of
D by two distinct vertices $v^i$ and $v^o$ (the notations stand for “v-in” and “v-out”,
and we can think of $v^i$ as the “entrance” to v while $v^o$ is the “exit” from v) and
an arc $v^{io}$ that goes from $v^i$ to $v^o$. Any existing arc a of D becomes a new arc
$a^{oi}$ of D', whose source and target are specified as follows: If a has source u and
target v, then $a^{oi}$ will have source $u^o$ and target $v^i$.
Here is an example: If


then 


(where all arcs of the form $a^{oi}$ for $a \in A$ have been colored blue, whereas all arcs
of the form $v^{io}$ for $v \in V$ have been colored red). This D' satisfies the property
that we want it to satisfy: For instance, the two paths


(s, a, x, d, y, g, t) and
(s, b, z, i, y, e, w, h, t)




of D have the vertex y in common, so the corresponding two paths

\[ \left( s^o, a^{oi}, x^i, x^{io}, x^o, d^{oi}, y^i, y^{io}, y^o, g^{oi}, t^i \right) \qquad \text{and} \]
\[ \left( s^o, b^{oi}, z^i, z^{io}, z^o, i^{oi}, y^i, y^{io}, y^o, e^{oi}, w^i, w^{io}, w^o, h^{oi}, t^i \right) \]

of D' have the arc $y^{io}$ in common. If you think of D as a railway network with
the vertices being train stations and the arcs being train rides, then D’ is a more 
detailed version of this network that records a change of platforms as an arc as 
well. 

Here is a formal definition of the multidigraph $D' = (V', A', \psi')$ in full generality:


• We replace each vertex v of D by two new vertices $v^i$ and $v^o$. We call $v^i$ an
“in-vertex” and $v^o$ an “out-vertex”. The vertex set of D' will be the set

\[ V' := \underbrace{\{ v^i \mid v \in V \}}_{\text{in-vertices}} \cup \underbrace{\{ v^o \mid v \in V \}}_{\text{out-vertices}}. \]


• Each arc $a \in A$ is replaced by a new arc $a^{oi}$, which is defined as follows:
If the arc $a \in A$ has source u and target v, then we replace it by a new
arc $a^{oi}$, which has source $u^o$ and target $v^i$. This arc $a^{oi}$ will be called an
“arc-arc” of D' (since it originates from an arc of D).


• For any vertex $v \in V$ of D, we introduce a new arc $v^{io}$, which has source
$v^i$ and target $v^o$. This arc $v^{io}$ will be called a “vertex-arc” of D' (since it
originates from a vertex of D).


• The arc set of D' will be the set

\[ A' := \underbrace{\{ a^{oi} \mid a \in A \}}_{\text{the arc-arcs}} \cup \underbrace{\{ v^{io} \mid v \in V \}}_{\text{the vertex-arcs}}. \]

The map $\psi' : A' \to V' \times V'$ is defined as we already explained:

– For any arc-arc $a^{oi} \in A'$, we let $\psi'(a^{oi}) := (u^o, v^i)$, where $(u, v) = \psi(a)$.

– For any vertex-arc $v^{io} \in A'$, we let $\psi'(v^{io}) := (v^i, v^o)$.


Note that D’ is something like a “bipartite digraph”: Each of its arcs goes 
either from an out-vertex to an in-vertex or vice versa. Namely, each arc-arc 
goes from an out-vertex to an in-vertex, whereas each vertex-arc goes from an 
in-vertex to an out-vertex. Thus, on any walk of D’, the arc-arcs and the vertex- 
arcs have to alternate. 
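
The vertex-splitting construction just described can be written out as a short program. The following Python sketch is one possible encoding, under assumptions of our own: the function name `split_vertices` is hypothetical, the in-vertex and out-vertex of v are represented by the pairs `(v, "i")` and `(v, "o")`, and each arc of D' carries a label saying whether it is an arc-arc or a vertex-arc.

```python
def split_vertices(vertices, arcs):
    """Build D' from D by stretching each vertex into a vertex-arc.

    vertices: list of vertices of D; arcs: list of (source, target) pairs.
    Returns (V', A'), where each element of A' is a triple
    (source, target, label) with label ("arc", index) or ("vertex", v).
    """
    # Each vertex v yields an in-vertex (v, "i") and an out-vertex (v, "o").
    new_vertices = [(v, "i") for v in vertices] + [(v, "o") for v in vertices]
    # Arc-arcs: an arc from u to v becomes an arc from u-out to v-in.
    arc_arcs = [((u, "o"), (v, "i"), ("arc", idx))
                for idx, (u, v) in enumerate(arcs)]
    # Vertex-arcs: one arc from v-in to v-out for each vertex v.
    vertex_arcs = [((v, "i"), (v, "o"), ("vertex", v)) for v in vertices]
    return new_vertices, arc_arcs + vertex_arcs
```

In the output, every arc-arc goes from an out-vertex to an in-vertex and every vertex-arc goes from an in-vertex to an out-vertex, which is exactly the alternation property noted above.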
If $p = (v_0, a_1, v_1, a_2, v_2, \ldots, a_k, v_k)$ is any nontrivial¹³⁰ path of D, then we can
define a corresponding path $p^{oi}$ of D' by

\[ p^{oi} := \left( v_0^o, a_1^{oi}, v_1^i, v_1^{io}, v_1^o, a_2^{oi}, v_2^i, v_2^{io}, v_2^o, a_3^{oi}, \ldots, a_k^{oi}, v_k^i \right). \]

This path $p^{oi}$ is obtained from p by

• replacing the starting point $v_0$ by $v_0^o$;

• replacing the ending point $v_k$ by $v_k^i$;

• replacing each other vertex $v_j$ by the sequence $v_j^i, v_j^{io}, v_j^o$;

• replacing each arc $a_j$ by $a_j^{oi}$.


Informally speaking, this simply means that we stretch out each intermediate 
vertex of p to the corresponding arc. 
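
The passage from a path p of D to the stretched path of D' can be sketched in a few lines of Python. The encoding here is our own assumption: a path is given by its list of vertices and its list of arc labels, and each entry of the stretched path is a pair consisting of the original vertex or arc together with one of the tags `"o"`, `"i"`, `"io"`, `"oi"`. The function name `stretch_path` is hypothetical.

```python
def stretch_path(vertices, arc_labels):
    """Stretch a nontrivial path p = (v0, a1, v1, ..., ak, vk) of D into a path of D'.

    vertices   = [v0, v1, ..., vk];  arc_labels = [a1, ..., ak] with k >= 1.
    Returns the list (v0-out, a1-oi, v1-in, v1-io, v1-out, a2-oi, ..., ak-oi, vk-in).
    """
    assert arc_labels and len(vertices) == len(arc_labels) + 1
    k = len(arc_labels)
    stretched = [(vertices[0], "o")]             # starting point becomes its out-vertex
    for j, a in enumerate(arc_labels, start=1):
        stretched.append((a, "oi"))              # arc a_j becomes the arc-arc
        v = vertices[j]
        stretched.append((v, "i"))               # enter v through its in-vertex
        if j < k:                                # intermediate vertex: add its vertex-arc
            stretched.append((v, "io"))
            stretched.append((v, "o"))
    return stretched
```

With sample labels of our own choosing, stretching the vertex lists `["s","x","y","t"]` and `["s","z","y","w","t"]` produces two lists that share the entry `("y", "io")`: a common intermediate vertex of the two paths has become a common arc.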

If p is a path from s to t in D, then $p^{oi}$ is a path from $s^o$ to $t^i$ in D'. Conversely,
any path from $s^o$ to $t^i$ in D' must have the form $p^{oi}$, where p is some path from
s to t in D (because on any walk of D', the arc-arcs and the vertex-arcs have to
alternate). Therefore, the map

\[ \{ \text{paths from } s \text{ to } t \text{ in } D \} \to \{ \text{paths from } s^o \text{ to } t^i \text{ in } D' \}, \qquad p \mapsto p^{oi} \tag{51} \]

is a bijection. Moreover, two paths p and q of D are internally vertex-disjoint if
and only if the paths $p^{oi}$ and $q^{oi}$ are arc-disjoint (since each vertex of a path p
except for its starting and ending points is represented by an arc in $p^{oi}$).

Now, let k be the maximum number of pairwise arc-disjoint paths from $s^o$
to $t^i$ in D'. Thus, D' has k pairwise arc-disjoint paths from $s^o$ to $t^i$. Applying
the inverse of the bijection (51) to these k paths, we obtain k pairwise internally
vertex-disjoint paths from s to t in D (because two paths p and q of D are
internally vertex-disjoint if and only if the paths $p^{oi}$ and $q^{oi}$ are arc-disjoint).
Hence, 


(the maximum number of pairwise internally vertex-disjoint 
paths from s to t in D) 
>k. (52) 


Our next goal is to find an internal s-t-vertex-separator $W \subseteq V \setminus \{s, t\}$ of size
$|W| \leq k$.
First, we simplify our setting a bit. 


130We say that a path is nontrivial if it has length > 0. 
A path from s to t cannot contain a loop; nor can it contain an arc with
source t and target s (since the vertices of a path must be distinct). Hence,
we can remove such arcs (i.e., loops as well as arcs with source t and target s)
from D without affecting the meaning of the claim we are proving. Thus, we
WLOG assume that the digraph D has no such arcs. Since we also know (by
assumption) that D has no arc with source s and target t, we thus conclude that
D has no arc with source $\in \{s, t\}$ and target $\in \{s, t\}$ (because each such arc
would either have source s and target t, or have source t and target s, or be a
loop). In other words, each arc of D has at least one endpoint¹³¹ distinct from
both s and t.

However, k is the maximum number of pairwise arc-disjoint paths from $s^o$ to
$t^i$ in D'. Therefore, by Theorem 10.1.10 (applied to $D' = (V', A', \psi')$, $s^o$ and $t^i$
instead of $D = (V, A, \psi)$, s and t), this number k equals the minimum size of an
$s^o$-$t^i$-cut in D'. Hence, there exists an $s^o$-$t^i$-cut $[S, \overline{S}]$ in D' such that $|[S, \overline{S}]| = k$.
Consider this $s^o$-$t^i$-cut $[S, \overline{S}]$. Since $[S, \overline{S}]$ is an $s^o$-$t^i$-cut, we have $S \subseteq V'$ and
$s^o \in S$ and $t^i \notin S$.

Let $B := [S, \overline{S}]$. Then, $|B| = |[S, \overline{S}]| = k$. Moreover, it is easy to see that
$s^{io} \notin B$ ¹³² and $t^{io} \notin B$ ¹³³.

To each vertex $w \in V'$ of D', we assign a vertex $\beta(w) \in V$ of D as follows: If
$w = v^i$ or $w = v^o$ for some $v \in V$, then we set $\beta(w) := v$. In other words, $\beta(w)$
is the vertex v such that $w \in \{v^i, v^o\}$. We shall call $\beta(w)$ the base of the vertex
w.

For each arc $b \in B$, there exists at least one endpoint w of b such that $\beta(w) \in
V \setminus \{s, t\}$.¹³⁴ We choose such an endpoint w arbitrarily, and we denote its


131 An endpoint of an arc means a vertex that is either the source or the target of this arc. 

132 Proof. If we had s^{io} ∈ [S, \overline{S}], then we would have s^i ∈ S and s^o ∈ \overline{S}; however, s^o ∈ \overline{S} would
contradict s^o ∈ S. Thus, we cannot have s^{io} ∈ [S, \overline{S}]. In other words, we cannot have s^{io} ∈ B
(since B = [S, \overline{S}]). Hence, s^{io} ∉ B.

133 Proof. If we had t^{io} ∈ [S, \overline{S}], then we would have t^i ∈ S and t^o ∈ \overline{S}; however, t^i ∈ S would
contradict t^i ∉ S. Thus, we cannot have t^{io} ∈ [S, \overline{S}]. In other words, we cannot have t^{io} ∈ B
(since B = [S, \overline{S}]). Hence, t^{io} ∉ B.

134 Proof. Let b ∈ B be an arc. We must prove that there exists at least one endpoint w of b such
that β(w) ∈ V \ {s,t}.

The arc b is either a vertex-arc or an arc-arc. Thus, we are in one of the following two
cases:

Case 1: The arc b is a vertex-arc.

Case 2: The arc b is an arc-arc.

Let us first consider Case 1. In this case, the arc b is a vertex-arc. In other words, b = v^{io}
for some v ∈ V. Consider this v. Then, v^{io} = b ∈ B. Hence, v ≠ s (since v = s would
entail v^{io} = s^{io} ∉ B, which would contradict v^{io} ∈ B) and v ≠ t (since v = t would entail
v^{io} = t^{io} ∉ B, which would contradict v^{io} ∈ B). Therefore, v ∈ V \ {s,t}. Also, clearly, v^i
is an endpoint of b and satisfies β(v^i) = v ∈ V \ {s,t}. Hence, there exists at least one
endpoint w of b such that β(w) ∈ V \ {s,t} (namely, w = v^i). Thus, our proof is complete
in Case 1.

Let us now consider Case 2. In this case, the arc b is an arc-arc. In other words, b = a^{oi}
for some a ∈ A. Consider this a. Now, a is an arc of D (since a ∈ A), and thus has at least

base β(w) by β(b). We shall call β(b) the basepoint of the arc b. Thus, by
definition, we have

β(b) ∈ V \ {s,t} for each b ∈ B. (53)

We let β(B) denote the set {β(b) | b ∈ B}. Clearly, |β(B)| ≤ |B| = k.
Now, we claim that

every path from s to t (in D) contains a vertex in β(B). (54)

[Proof of (54): Let p be a path from s to t (in D). We must prove that p contains a
vertex in β(B).

Recall that we have assigned a path p^{oi} of D’ to the path p of D. The definition of
p^{oi} shows that the base of any vertex of p^{oi} is a vertex of p (indeed, if v_0, v_1, ..., v_k are
the vertices of p, then the vertices of p^{oi} are v_0^o, v_1^i, v_1^o, v_2^i, v_2^o, ..., v_{k-1}^i, v_{k-1}^o, v_k^i, and their
respective bases are v_0, v_1, v_1, v_2, v_2, ..., v_{k-1}, v_{k-1}, v_k).

The path p^{oi} is a path from s^o to t^i (since p is a path from s to t). Hence, it starts at
a vertex in S (since s^o ∈ S) and ends at a vertex that is not in S (since t^i ∉ S). Thus,
this path p^{oi} must cross from S into \overline{S} somewhere. In other words, there exists an arc b
of p^{oi} such that the source of b belongs to S but the target of b belongs to \overline{S}. Consider
this arc b. Thus, b ∈ [S, \overline{S}] = B, so that β(b) ∈ β(B) (by the definition of β(B)). Both
endpoints of b are vertices of p^{oi} (since b is an arc of p^{oi}).

Now, consider the basepoint β(b) of this arc b. This basepoint β(b) is the base of an
endpoint of b (by the definition of β(b)). Thus, β(b) is the base of a vertex of p^{oi} (since
both endpoints of b are vertices of p^{oi}). Hence, β(b) is a vertex of p (since the base of
any vertex of p^{oi} is a vertex of p). In other words, the path p contains the vertex β(b).
Since β(b) ∈ β(B), we thus conclude that p contains a vertex in β(B). This proves
(54).]


Now, the set β(B) is a subset of V \ {s,t} (since β(b) ∈ V \ {s,t} for each
b ∈ B) and has the property that every path from s to t contains a vertex in
β(B) (by (54)). In other words, β(B) is a subset W ⊆ V \ {s,t} such that every
path from s to t contains a vertex in W. In other words, β(B) is an internal s-t-
vertex-separator (by the definition of an “internal s-t-vertex-separator”). Thus,

(the minimum size of an internal s-t-vertex-separator)
≤ |β(B)| ≤ k
≤ (the maximum number of pairwise internally vertex-disjoint
paths from s to t in D)


one endpoint distinct from both s and t (since we have shown above that each arc of D has
at least one endpoint distinct from both s and t). Let x be this endpoint. Then, x ∈ V \ {s,t}
(since x is distinct from both s and t).

But x is an endpoint of a. In other words, x is either the source or the target of a.
Hence, the arc a^{oi} of D’ either has source x^o or has target x^i (by the definition of a^{oi}). In
other words, the arc b of D’ either has source x^o or has target x^i (since b = a^{oi}). Since
β(x^o) = x ∈ V \ {s,t} and β(x^i) = x ∈ V \ {s,t}, we thus conclude that the arc b of D’ has
at least one endpoint w such that β(w) ∈ V \ {s,t} (namely, w = x^o if b has source x^o, and
w = x^i if b has target x^i). This completes our proof in Case 2.

Thus, we are done in both Cases 1 and 2, so that our proof is complete.

(by (52)).
On the other hand, we have

(the minimum size of an internal s-t-vertex-separator)
≥ (the maximum number of pairwise internally vertex-disjoint
paths from s to t in D)

(by the pigeonhole principle¹³⁵). Combining this inequality with the preceding
one, we obtain

(the minimum size of an internal s-t-vertex-separator)
= (the maximum number of pairwise internally vertex-disjoint
paths from s to t in D).

This proves Theorem 10.1.32. □


There is also a variant of the vertex-Menger theorem similar to what Theorem
10.1.18 did for the arc-Menger theorem. Again, we need some notations first:


Definition 10.1.35. Let D = (V, A, ψ) be a multidigraph, and let X and Y be
two subsets of V.


(a) A path from X to Y shall mean a path whose starting point belongs to 
X and whose ending point belongs to Y. 


135 Proof in more detail: Let n be the minimum size of an internal s-t-vertex-separator. Let x be
the maximum number of pairwise internally vertex-disjoint paths from s to t in D. We must
show that n ≥ x.

Assume the contrary. Thus, n < x.

The definition of n shows that there exists an internal s-t-vertex-separator W that has size
n.

The set W is an internal s-t-vertex-separator. In other words, W is a subset of V \ {s,t}
such that every path from s to t contains a vertex in W. Moreover, W has size n; thus,
|W| = n < x.

The definition of x shows that there exist x pairwise internally vertex-disjoint paths from
s to t in D. Let p_1, p_2, ..., p_x be these x paths. Each of these x paths p_1, p_2, ..., p_x must
contain at least one vertex in W (since every path from s to t contains a vertex in W). Since
|W| < x, we thus conclude by the pigeonhole principle that at least two of the x paths
p_1, p_2, ..., p_x must contain the same vertex in W. In other words, there exist two distinct
elements i, j ∈ {1, 2, ..., x} such that p_i and p_j contain the same vertex in W. Let w be the
latter vertex. Thus, w ∈ W ⊆ V \ {s,t}. Hence, w is distinct from both s and t. Therefore,
w is an intermediate vertex of p_i (since the path p_i has starting point s and ending point t).
Likewise, w is an intermediate vertex of p_j.

However, the paths p_i and p_j are internally vertex-disjoint, and thus have no common
intermediate vertex. This contradicts the fact that w is an intermediate vertex of both paths
p_i and p_j. This contradiction shows that our assumption was false. Hence, n ≥ x is proved,
qed.

(b) A subset W of V is said to be an X-Y-vertex-separator if each path from
X to Y contains at least one vertex from W. Equivalently, a subset W 
of V is said to be an X-Y-vertex-separator if the induced subdigraph of 
D on the set V \ W has no path from X to Y (in other words, removing 
from D all vertices contained in W destroys all paths from X to Y). 


(c) An X-Y-vertex-separator W is said to be internal if it is a subset of
V \ (X ∪ Y) (that is, if it is disjoint from X and from Y).
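
The equivalent formulation in part (b) suggests a direct computational test: delete the vertices of W and search for a surviving path from X to Y. The following Python sketch illustrates this under an assumed encoding that is not part of these notes: the multidigraph is a list of (source, target) pairs, and the function names are hypothetical.

```python
from collections import deque

def is_vertex_separator(vertices, arcs, X, Y, W):
    """Return True if W is an X-Y-vertex-separator.

    Assumed encoding (for illustration only): `arcs` is a list of
    (source, target) pairs; parallel arcs and loops are allowed.
    """
    alive = set(vertices) - set(W)            # the vertex set V \ W
    succ = {v: [] for v in alive}
    for u, v in arcs:
        if u in alive and v in alive:         # keep only arcs of the induced subdigraph
            succ[u].append(v)
    start = [x for x in X if x in alive]
    seen = set(start)
    queue = deque(start)
    while queue:                              # breadth-first search from X
        u = queue.popleft()
        if u in Y:                            # some path from X to Y avoids W
            return False
        for v in succ[u]:
            if v not in seen:
                seen.add(v)
                queue.append(v)
    return True

def is_internal(X, Y, W):
    """Part (c): W is internal if it is disjoint from X and from Y."""
    return not (set(W) & (set(X) | set(Y)))
```

For instance, in the digraph with arcs 1→2, 2→4, 1→3, 3→4 and X = {1}, Y = {4}, the set {2, 3} passes both tests, whereas {1} is an X-Y-vertex-separator that is not internal.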


Theorem 10.1.36 (vertex-Menger theorem for directed graphs, multi-terminal
version 1). Let D = (V, A, ψ) be a multidigraph, and let X and Y be two
disjoint subsets of V. Assume that D has no arc with source in X and target 
in Y. 

Then, the maximum number of pairwise internally vertex-disjoint paths 
from X to Y equals the minimum size of an internal X-Y-vertex-separator. 


Example 10.1.37. Let D be the following multidigraph:

[Figure: the multidigraph D.]

Then, the maximum number of pairwise internally vertex-disjoint paths from
X to Y is 2; here are two such paths (drawn in red and blue):

[Figure: the multidigraph D with two such paths highlighted.]

(there are other choices, of course). The minimum size of an internal X-Y-
vertex-separator is 2 as well; indeed, {a,b} is such an internal X-Y-vertex-
separator. These two numbers are equal, just as Theorem 10.1.36 predicts.


Proof of Theorem 10.1.36. We define a new multidigraph D’ = (V’, A’, ψ’) as in
the proof of Theorem 10.1.18. Then, D’ has no arc with source s and target t
(since D has no arc with source in X and target in Y).

Hence, Theorem 10.1.32 (applied to D’ = (V’, A’, ψ’) instead of D = (V, A, ψ))
shows that the maximum number of pairwise internally vertex-disjoint paths
from s to t in D’ equals the minimum size of an internal s-t-vertex-separator in
D’.

Let us now see what this result means for our original digraph D. Indeed:


• The minimum size of an internal s-t-vertex-separator in D’ equals the
minimum size of an internal X-Y-vertex-separator in D (indeed, the internal
s-t-vertex-separators in D’ are precisely the internal X-Y-vertex-separators
in D¹³⁶).


• The maximum number of pairwise internally vertex-disjoint paths from
s to t in D’ equals the maximum number of pairwise internally vertex-
disjoint paths from X to Y in D¹³⁷.


136 Proof. We recall the definitions of internal s-t-vertex-separators in D’ and of internal X-Y- 
vertex-separators in D: 


- An internal s-t-vertex-separator in D’ is a subset W of V’ \ {s, t} such that each path from 
s to t contains at least one vertex from W. 


- An internal X-Y-vertex-separator in D is a subset W of V \ (X ∪ Y) such that each path
from X to Y contains at least one vertex from W.


These two definitions describe the same object, because of the following two reasons:

- We have V’ \ {s,t} = V \ (X ∪ Y).


- The paths from s to t are in bijection with the paths from X to Y (indeed, any path of the 
latter kind can be transformed into a path of the former kind by replacing the starting 
point by s and replacing the ending point by t). This bijection preserves the intermediate 
vertices (i.e., the vertices other than the starting point and the ending point). Thus, a path 
p from s to t contains at least one vertex from W if and only if the corresponding path 
from X to Y (that is, the image of p under our bijection) contains at least one vertex from 
W. 


Thus, the internal s-t-vertex-separators in D’ are precisely the internal X-Y-vertex- 
separators in D. 
137 Proof. We make the following two observations: 


Observation 1: Let k ∈ N. If D’ has k pairwise internally vertex-disjoint paths from
s to t, then D has k pairwise internally vertex-disjoint paths from X to Y.


[Proof of Observation 1: Assume that D’ has k pairwise internally vertex-disjoint paths from
s to t. We can “lift” these k paths to k paths from X to Y in D (preserving the arcs, and
replacing the vertices s and t by appropriate vertices in X and Y to make them belong to the

Hence, the result of the preceding paragraph is precisely the claim of Theorem
10.1.36, and our proof is thus complete. □


Another variant of this result can be stated for vertex-disjoint (as opposed to
internally vertex-disjoint) paths. These are even easier to define:


Definition 10.1.38. Two paths p and q in a graph or digraph are said to be 
vertex-disjoint if they have no common vertices. 


Theorem 10.1.39 (vertex-Menger theorem for directed graphs, multi-terminal
version 2). Let D = (V, A, ψ) be a multidigraph, and let X and Y be two
subsets of V.

Then, the maximum number of pairwise vertex-disjoint paths from X to Y 
equals the minimum size of an X-Y-vertex-separator. 
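
On very small digraphs, the equality claimed by Theorem 10.1.39 can be checked by brute force. The following Python sketch does this; it is only an illustration, not an efficient algorithm. The encoding (a set of (source, target) pairs, so that parallel arcs are ignored, which does not affect either number) and all function names are assumptions made for this example.

```python
from itertools import combinations

def paths_from_X_to_Y(vertices, arc_set, X, Y):
    """List all paths (as tuples of distinct vertices) from X to Y."""
    found = []
    def extend(path):
        if path[-1] in Y:
            found.append(path)
        for v in vertices:
            if v not in path and (path[-1], v) in arc_set:
                extend(path + (v,))
    for x in X:
        extend((x,))
    return found

def max_vertex_disjoint(paths):
    """Largest number of pairwise vertex-disjoint paths among `paths`."""
    for r in range(len(paths), 0, -1):
        for choice in combinations(paths, r):
            if all(set(p).isdisjoint(q) for p, q in combinations(choice, 2)):
                return r
    return 0

def min_separator_size(vertices, paths):
    """Smallest size of a vertex set that meets every path in `paths`."""
    for r in range(len(vertices) + 1):
        for W in combinations(vertices, r):
            if all(set(W) & set(p) for p in paths):
                return r

# A hypothetical example digraph (not taken from the notes).
V = [1, 2, 3, 4, 5, 6]
A = {(1, 3), (2, 3), (3, 4), (3, 5), (2, 6)}
paths = paths_from_X_to_Y(V, A, [1, 2], {4, 5, 6})
print(max_vertex_disjoint(paths), min_separator_size(V, paths))
```

In this example, both numbers are 2: the paths (1, 3, 4) and (2, 6) are vertex-disjoint, and {1, 2} is an X-Y-vertex-separator.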


Example 10.1.40. Let D be the following multidigraph:

[Figure: the multidigraph D, with the vertex sets X and Y marked.]


right arcs). The resulting k paths from X to Y in D are still pairwise internally vertex-disjoint 


(since our “lifting” operation has not changed the intermediate vertices of our paths). Thus, 
D has k pairwise internally vertex-disjoint paths from X to Y. This proves Observation 1.] 


Observation 2: Let k ∈ N. If D has k pairwise internally vertex-disjoint paths from
X to Y, then D’ has k pairwise internally vertex-disjoint paths from s to t.


[Proof of Observation 2: Assume that D has k pairwise internally vertex-disjoint paths from 
X to Y. We can replace these k paths by k pairwise internally vertex-disjoint walks from s 
to t in D’ (by replacing their starting points with s and replacing their ending points with 
t). Thus, D’ has k pairwise internally vertex-disjoint walks from s to t. Therefore, D’ has 
k pairwise internally vertex-disjoint paths from s to t as well (since each walk contains a 
path, and of course we don’t lose internal vertex-disjointness if we restrict our walk to a 
path contained in it). This proves Observation 2.] 

Observation 2 shows that the maximum number of pairwise internally vertex-disjoint
paths from s to t in D’ is ≥ the maximum number of pairwise internally vertex-disjoint
paths from X to Y in D. But Observation 1 shows the reverse inequality (i.e., it shows that
the former number is ≤ the latter number). Thus, the inequality is an equality, i.e., the
two numbers are equal. Qed.

Then, the maximum number of pairwise vertex-disjoint paths from X to Y is
2. Here are two such paths (drawn in red and blue):

[Figure: the multidigraph D with two such paths highlighted.]


If we were only looking for internally vertex-disjoint paths, then we could
add a third path to these two (namely, the path that starts at the topmost
vertex of X and ends at the topmost vertex of Y). However, this path and our
red path are only internally vertex-disjoint, not vertex-disjoint. A little bit
of thought shows that D has no more than 2 vertex-disjoint paths from X to
Y.

The minimum size of an X-Y-vertex-separator is 2 as well; indeed, {u, y} is
such an X-Y-vertex-separator. This number equals the maximum number of
pairwise vertex-disjoint paths from X to Y, just as Theorem 10.1.39 predicts.


Proof of Theorem 10.1.39. We will reduce this to Theorem 10.1.32, again by tweaking
our digraph appropriately. This time, the tweak is pretty simple: We add
two new vertices s and t to D, and we furthermore add an arc from s to each
x ∈ X and an arc from each y ∈ Y to t (thus, we add a total of |X| + |Y| new
arcs). We denote the resulting digraph by D’. In more detail, the definition of
D’ is as follows:


• We introduce two new vertices s and t, and we set V’ := V ∪ {s,t}. This
set V’ will be the vertex set of D’.


• For each x ∈ X, we introduce a new arc a_x, which shall have source s and
target x.


• For each y ∈ Y, we introduce a new arc b_y, which shall have source y and
target t.


• We let A’ := A ∪ {a_x | x ∈ X} ∪ {b_y | y ∈ Y}. This set A’ will be the arc
set of D’.

• We extend our map ψ : A → V × V to a map ψ’ : A’ → V’ × V’ by setting

ψ’(a_x) = (s, x) for each x ∈ X

and

ψ’(b_y) = (y, t) for each y ∈ Y

(and, of course, ψ’(c) = ψ(c) for each c ∈ A).


• We define D’ to be the multidigraph (V’, A’, ψ’).
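
The construction of D’ can be mirrored in a few lines of Python. The sketch below assumes (purely for illustration) that a multidigraph is stored as a vertex set together with a dictionary sending each arc name to its (source, target) pair, which plays the role of ψ. The function name and the arc names ("a", x) and ("b", y) are hypothetical and are assumed not to clash with existing arc names.

```python
def add_source_and_sink(vertices, arcs, X, Y, s="s", t="t"):
    """Build D' = (V', A', psi') from D = (V, A, psi).

    `arcs` is a dict: arc name -> (source, target); this encoding is an
    assumption of the example, not something fixed by the notes.
    """
    assert s not in vertices and t not in vertices   # s and t must be new
    new_vertices = set(vertices) | {s, t}            # V' = V ∪ {s, t}
    new_arcs = dict(arcs)                            # psi'(c) = psi(c) for c in A
    for x in X:
        new_arcs[("a", x)] = (s, x)                  # the arc a_x from s to x
    for y in Y:
        new_arcs[("b", y)] = (y, t)                  # the arc b_y from y to t
    return new_vertices, new_arcs
```

By construction, exactly |X| + |Y| arcs are added, and none of them has source s and target t.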


For instance, if D is the multidigraph from Example 10.1.40, then D’ looks as
follows:

[Figure: the multidigraph D’.]

(The arcs a_x are drawn in red; the arcs b_y are drawn in blue.)

By its construction, the digraph D’ has no arc with source s and target t.
Hence, Theorem 10.1.32 (applied to D’ = (V’, A’, ψ’) instead of D = (V, A, ψ))
yields that the maximum number of pairwise internally vertex-disjoint paths
from s to t equals the minimum size of an internal s-t-vertex-separator.
However, it is easy to see the following two claims:


Claim 1: The maximum number of pairwise internally vertex-disjoint 
paths from s to t equals the maximum number of pairwise vertex- 
disjoint paths from X to Y (in D). 


Claim 2: The minimum size of an internal s-t-vertex-separator equals 
the minimum size of an X-Y-vertex-separator (in D). 


[Proof of Claim 1 (sketched): Given any path p from s to t, we can remove the
starting point and the ending point of this path; the result will always be a path
from X to Y (in D). Let us denote the latter path by \overline{p}. Thus, we obtain a map

{paths from s to t} → {paths from X to Y (in D)},
p ↦ \overline{p}.

This map is easily seen to be a bijection (indeed, if q is a path from X to Y
(in D), then we can easily extend it to a path p from s to t by inserting the
appropriate arc a_x at its beginning and the appropriate arc b_y at its end; this
latter path p will then satisfy \overline{p} = q). Moreover, two paths p and q from s to
t are internally vertex-disjoint if and only if the corresponding paths \overline{p} and \overline{q}
are vertex-disjoint (because the intermediate vertices of p are the vertices of \overline{p},
whereas the intermediate vertices of q are the vertices of \overline{q}). This proves Claim
1.]


[Proof of Claim 2 (sketched): It is easy to see that the internal s-t-vertex-separators 
are precisely the X-Y-vertex-separators (in D). (To show this, compare the def- 
initions of these two objects using the bijection from the proof of Claim 1, and 
observe that V’ \ {s,t} = V.) From this, Claim 2 follows.] 


Recall that the maximum number of pairwise internally vertex-disjoint paths 
from s to t equals the minimum size of an internal s-t-vertex-separator. In view 
of Claim 1 and Claim 2, we can rewrite this as follows: The maximum number 
of pairwise vertex-disjoint paths from X to Y equals the minimum size of an 
X-Y-vertex-separator. Thus, Theorem 10.1.39 is proved. □


We note that Hall’s Marriage Theorem (Theorem 8.3.4) can be easily derived
from any of the directed Menger theorems (exercise!). I have heard that this
can also be done in reverse. This places the Menger theorems in the cluster of
theorems equivalent to Hall’s Marriage Theorem (such as König’s theorem).


10.1.4. The vertex-Menger theorem for undirected graphs 


Vertex-Menger theorems also exist for undirected graphs. Here are the
undirected analogues of Theorem 10.1.32, Theorem 10.1.36 and Theorem 10.1.39,
along with the definitions they rely on:


Definition 10.1.41. Let G = (V, E, φ) be a multigraph, and let s and t be
two vertices of G. A subset W of V \ {s,t} is said to be an internal s-t-
vertex-separator if each path from s to t contains at least one vertex from
W. Equivalently, a subset W of V \ {s,t} is said to be an internal s-t-vertex-
separator if the induced subgraph of G on the set V \ W has no path from s
to t (in other words, removing from G all vertices contained in W destroys
all paths from s to t).


Theorem 10.1.42 (vertex-Menger theorem for undirected graphs). Let G =
(V, E, φ) be a multigraph, and let s and t be two distinct vertices of G.
Assume that G has no edge with endpoints s and t. Then, the maximum
number of pairwise internally vertex-disjoint paths from s to t equals the
minimum size of an internal s-t-vertex-separator.

Definition 10.1.43. Let G = (V, E, φ) be a multigraph, and let X and Y be
two subsets of V.


(a) A path from X to Y shall mean a path whose starting point belongs to 
X and whose ending point belongs to Y. 


(b) A subset W of V is said to be an X-Y-vertex-separator if each path from 
X to Y contains at least one vertex from W. Equivalently, a subset W 
of V is said to be an X-Y-vertex-separator if the induced subgraph of 
G on the set V \ W has no path from X to Y (in other words, removing 
from G all vertices contained in W destroys all paths from X to Y). 


(c) An X-Y-vertex-separator W is said to be internal if it is a subset of
V \ (X ∪ Y) (that is, if it is disjoint from X and from Y).


Theorem 10.1.44 (vertex-Menger theorem for undirected graphs, multi-terminal
version 1). Let G = (V, E, φ) be a multigraph, and let X and Y be two
disjoint subsets of V. Assume that G has no edge with one endpoint in X 
and the other endpoint in Y. 

Then, the maximum number of pairwise internally vertex-disjoint paths 
from X to Y equals the minimum size of an internal X-Y-vertex-separator. 


Theorem 10.1.45 (vertex-Menger theorem for undirected graphs, multi-terminal
version 2). Let G = (V, E, φ) be a multigraph, and let X and Y be two
subsets of V. 

Then, the maximum number of pairwise vertex-disjoint paths from X to Y 
equals the minimum size of an X-Y-vertex-separator. 


Theorem 10.1.42, Theorem 10.1.44 and Theorem 10.1.45 follow immediately
by applying the analogous theorems for directed graphs (i.e., Theorem 10.1.32,
Theorem 10.1.36 and Theorem 10.1.39) to the digraph G^{bidir} instead of D (since
the paths of G are in bijection with the paths of G^{bidir}).


10.2. The Gallai-Milgram theorem 

Next, we proceed to some more obscure properties of paths in digraphs and 
graphs. 

10.2.1. Definitions 


In order to state the first of these properties, we need the following three
definitions:

Definition 10.2.1. Two vertices u and v of a multidigraph D are said to be
adjacent if they are adjacent in the undirected graph D^{und}. (In other words,
they are adjacent if and only if D has an arc with source u and target v or an
arc with source v and target u.)


Definition 10.2.2. An independent set of a multidigraph D means a subset
S of V(D) such that no two elements of S are adjacent. In other words, it
means an independent set of the undirected graph D^{und}.


Definition 10.2.3. A path cover of a multidigraph D means a set of paths of 
D such that each vertex of D is contained in exactly one of these paths. 


Example 10.2.4. Let D be the following digraph:

[Figure: the digraph D, with vertices 1, 2, 3, 4, 5.]

Then, {(1,*,5,*,4), (3,*,2)} is a path cover of D (we are again writing
asterisks for the arcs, since the arcs of D are uniquely determined by their sources
and their targets). Another path cover of D is {(1,*,3,*,4), (2), (5)}. Yet
another path cover of D is {(1), (2), (3), (4), (5)}. There are many more.

Note that the set {(1,*,5,*,4), (3,*,2,*,4)} is not a path cover of D, since
the vertex 4 is contained in two (not one) of its paths.

Let us draw the three path covers we have mentioned (by simply drawing
the arcs of the paths they contain, while omitting all other arcs of D):

[Figure: the three path covers {(1,*,5,*,4), (3,*,2)}, {(1,*,3,*,4), (2), (5)} and {(1), (2), (3), (4), (5)}.]

(Note that we have already seen path covers of a “complete” simple digraph
(V, V × V) in the proof of Theorem 4.9.6; we called them “path covers of V”.)

Remark 10.2.5. Let D be a digraph. A path cover of D consisting of only 1
path is the same as a Hamiltonian path of D. (More precisely: A single path
p forms a path cover {p} of D if and only if p is a Hamiltonian path.)
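
The definition of a path cover is easy to test mechanically. The following Python sketch checks it for the path covers mentioned in Example 10.2.4. Since the picture of D is not reproduced here, the arc set below is an assumption: it contains only the arcs that the paths in the example use, and the true digraph may have more. Paths are written as tuples of vertices (without the asterisks), and the function names are hypothetical.

```python
def is_path(arc_set, path):
    """A path: distinct vertices, with an arc from each vertex to the next."""
    return (len(path) >= 1
            and len(set(path)) == len(path)
            and all((u, v) in arc_set for u, v in zip(path, path[1:])))

def is_path_cover(vertices, arc_set, paths):
    """Each vertex must lie on exactly one of the given paths."""
    if not all(is_path(arc_set, p) for p in paths):
        return False
    used = [v for p in paths for v in p]
    return len(used) == len(set(used)) and set(used) == set(vertices)

# Assumed arcs (only those used by the paths in Example 10.2.4).
V = {1, 2, 3, 4, 5}
A = {(1, 5), (5, 4), (3, 2), (1, 3), (3, 4), (2, 4)}

print(is_path_cover(V, A, [(1, 5, 4), (3, 2)]))            # a path cover
print(is_path_cover(V, A, [(1, 3, 4), (2,), (5,)]))        # a path cover
print(is_path_cover(V, A, [(1, 5, 4), (3, 2, 4)]))         # vertex 4 appears twice
```

In the spirit of Remark 10.2.5, calling `is_path_cover(V, A, [p])` with a single path p tests whether p is a Hamiltonian path.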


10.2.2. The Gallai-Milgram theorem


Now, the Gallai-Milgram theorem states the following: 


Theorem 10.2.6 (Gallai-Milgram theorem). Let D be a loopless digraph. 
Then, there exist a path cover P of D and an independent set S of D such 
that S has exactly one vertex from each path in P (in other words, for each 
path p € P, exactly one vertex of p belongs to S). 


Example 10.2.7. Let D be the digraph from Example 10.2.4. Then, Theorem
10.2.6 tells us that there exist a path cover P of D and an independent set S
of D such that S has exactly one vertex from each path in P. For example,
we can take P = {(1,*,5,*,4), (3,*,2)} and S = {5,3}.
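
The two conditions on S in Theorem 10.2.6 can be verified by a short computation. The Python sketch below does so for the choice of P and S above. As the picture of D is not reproduced here, the arc set is an assumption (it lists only the arcs used by the paths in Example 10.2.4), and the function names are hypothetical.

```python
def is_independent(arc_set, S):
    """No two distinct elements of S are joined by an arc (in either direction)."""
    return not any((u, v) in arc_set for u in S for v in S if u != v)

def has_one_vertex_per_path(paths, S):
    """S contains exactly one vertex from each of the given paths."""
    return all(len(set(p) & set(S)) == 1 for p in paths)

# Assumed arcs (only those used by the paths in Example 10.2.4).
A = {(1, 5), (5, 4), (3, 2), (1, 3), (3, 4), (2, 4)}
P = [(1, 5, 4), (3, 2)]

print(is_independent(A, {5, 3}), has_one_vertex_per_path(P, {5, 3}))
print(is_independent(A, {4, 2}), has_one_vertex_per_path(P, {4, 2}))
```

Under the assumed arc set, S = {5, 3} satisfies both conditions, whereas the set {4, 2} of ending points meets each path once but is not independent (because of the arc from 2 to 4).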


We will now prove Theorem 10.2.6, following Diestel’s book [Dieste17, Theorem 2.5.1]:


Proof of Theorem 10.2.6. Write the multidigraph D as D = (V, A, ψ). We
introduce a notation:


• If P is a path cover of D, then a cross-cut of P means a subset S of V that
contains exactly one vertex from each path in P.


Thus, the claim we must prove is saying that there exist a path cover P of D 
and an independent cross-cut of P. 
We will show something stronger: 


Claim 1: Any minimum-size path cover of D has an independent 
cross-cut. 


Note that the size of a path cover means the number of paths in it. Thus, a 
minimum-size path cover means a path cover with the smallest possible num- 
ber of paths. 

We will show something even stronger than Claim 1. To state this stronger 
claim, we need more notations: 


• If P is a path cover, then Ends P means the set of the ending points of all
paths in P. Note that |Ends P| = |P|.


• A path cover P is said to be end-minimal if no proper subset of Ends P
can be written as Ends Q for a path cover Q.

Example 10.2.8. For instance, if D is as in Example 10.2.4 and if

P = {(1,*,5,*,4), (3,*,2)},
Q = {(1,*,3,*,4), (2), (5)},
R = {(1), (2), (3), (4), (5)}

are the three path covers from Example 10.2.4, then

Ends P = {4,2}, Ends Q = {4,2,5}, Ends R = {1,2,3,4,5},

which shows immediately that neither Q nor R is end-minimal (since Ends P
is a proper subset of each of Ends Q and Ends R). It is easy to see that P is
end-minimal (and also minimum-size).
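
For a digraph as small as this one, end-minimality can be tested by enumerating all path covers. The following Python sketch does this by brute force; it is meant only as an illustration and does not scale. The arc set is again an assumption (only the arcs used by the paths in Example 10.2.4, since the picture of D is not reproduced here), and all function names are hypothetical.

```python
def all_path_covers(vertices, arc_set):
    """Yield every path cover as a frozenset of vertex tuples (brute force)."""
    def paths_from(path, free):
        yield path
        for v in free:
            if (path[-1], v) in arc_set:
                yield from paths_from(path + (v,), free - {v})

    def covers(free):
        if not free:
            yield frozenset()
            return
        m = min(free)                    # pick the path through the smallest free vertex
        for start in free:
            for p in paths_from((start,), free - {start}):
                if m in p:
                    for rest in covers(free - set(p)):
                        yield rest | {p}

    return covers(frozenset(vertices))

def ends(cover):
    """Ends P: the set of ending points of the paths in the cover."""
    return {p[-1] for p in cover}

def is_end_minimal(vertices, arc_set, cover):
    """No path cover Q has Ends Q a proper subset of Ends(cover)."""
    e = ends(cover)
    return not any(ends(q) < e for q in all_path_covers(vertices, arc_set))

# Assumed arcs (only those used by the paths in Example 10.2.4).
V = {1, 2, 3, 4, 5}
A = {(1, 5), (5, 4), (3, 2), (1, 3), (3, 4), (2, 4)}
P = {(1, 5, 4), (3, 2)}
Q = {(1, 3, 4), (2,), (5,)}

print(ends(P), is_end_minimal(V, A, P))
print(ends(Q), is_end_minimal(V, A, Q))
```

Under the assumed arc set, this agrees with the example: P is end-minimal, while Q is not.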


Back to the general case. Clearly, any minimum-size path cover of D is also
end-minimal¹³⁸. Thus, the following claim is stronger than Claim 1:


Claim 2: Any end-minimal path cover of D has an independent
cross-cut.


It is Claim 2 that we will be proving.¹³⁹

[Proof of Claim 2: We proceed by induction on |V|.

Base case: Claim 2 is obvious when |V| = 0 (since ∅ is an independent
cross-cut in this case).

Induction step: Consider a multidigraph D = (V, A, ψ) with |V| = N. Assume
(as the induction hypothesis) that Claim 2 is already proved for any
multidigraph with N − 1 vertices.


138 Proof. Let P be a minimum-size path cover of D. If P was not end-minimal, then there would 
be a path cover Q with |Ends Q| < |Ends P| and therefore |Q| = |Ends Q| < |Ends P| = 
|P|; but this would contradict the fact that P is minimum-size. Hence, P is end-minimal. 

139 On a sidenote: Is Claim 2 really stronger than Claim 1? Yes, because it can happen that
some end-minimal path cover fails to be minimum-size. For example, the path cover
{(1,*,2,*,3), (4)} in the digraph

[Figure: a digraph with vertices 1, 2, 3, 4.]

has this property.

Let P be an end-minimal path cover of D. We must show that P has an
independent cross-cut.

Let p_1, p_2, ..., p_k be the paths in P (listed without repetitions), and let v_1, v_2, ..., v_k
be their respective ending points. Thus, {v_1, v_2, ..., v_k} = Ends P and k =
|Ends P|.

Recall that we must find an independent cross-cut of P. If the set {v_1, v_2, ..., v_k}
is independent, then we are done (since this set {v_1, v_2, ..., v_k} is clearly a
cross-cut of P). Thus, we WLOG assume that this is not the case. Hence, there is
an arc from some vertex v_i to some vertex v_j. These two vertices v_i and v_j are
distinct (because D is loopless). Since we can swap our paths p_1, p_2, ..., p_k (and
thus their ending points v_1, v_2, ..., v_k) at will, we can thus WLOG assume that
i = 2 and j = 1. Assume this. Thus, there is an arc from v_2 to v_1. We shall refer
to this arc as the blue arc, and we will draw it accordingly¹⁴⁰:

[Figure: the paths p_1, p_2, ..., p_k, with the blue arc from v_2 to v_1.]

We can extend the path p_2 beyond its ending point v_2 by inserting the blue
arc and the vertex v_1 at its end. This results in a new path, which we denote by
p_2 + v_1; this path has ending point v_1.

If v_1 is the only vertex on the path p_1 (that is, if the path p_1 has length 0), then
we can replace the path p_2 by p_2 + v_1 and remove the length-0 path
p_1 from our path cover P, and we thus obtain a new path cover Q such that
Ends Q is a proper subset of Ends P. But this is impossible, since we assumed
that P is end-minimal. Therefore, v_1 is not the only vertex on p_1.

Thus, let v be the second-to-last vertex on p_1 (that is, the vertex that is
immediately followed by v_1). Then, the path p_1 contains an arc from v to v_1. We


140 This picture illustrates just one representative case, with k = 4. The four columns (from left
to right) are the four paths p_1, p_2, p_3, p_4. Of course, the digraph D can have many more arcs
than we have drawn on this picture, but we are not interested in them right now.

shall refer to this arc as the red arc, and we will draw it accordingly:

[Figure: the paths p_1, p_2, ..., p_k, with the blue arc from v_2 to v_1 and the red arc from v to v_1.]

Let D’ be the digraph D \ v_1 (that is, the digraph obtained from D by
removing the vertex v_1 and all arcs that have v_1 as source or target). Let p_1’ be
the result of removing the vertex v_1 and the red arc from the path p_1. Then,
P’ := {p_1’, p_2, p_3, ..., p_k} is a path cover of D’. Note that the path p_1’ has
ending point v (since it is obtained from p_1 by removing the last vertex and
the last arc, but we know that the second-to-last vertex on p_1 is v), whereas
the paths p_2, p_3, ..., p_k have ending points v_2, v_3, ..., v_k. Thus, Ends(P’) =
{v, v_2, v_3, ..., v_k}. Here is an illustration of the digraph D’ = D \ v_1 and its
path cover P’:

[Figure: the digraph D’ with its path cover P’.]


Consider the path cover P’ of D’. If we can find an independent
cross-cut of P’, then we will be done, because any such cross-cut will also be an
independent cross-cut of our original path cover {p_1, p_2, ..., p_k} = P. Since
the digraph D’ = D \ v_1 has N − 1 vertices¹⁴¹, we can find such an independent
cross-cut by our induction hypothesis if we can prove that the path cover P’ is
end-minimal (as a path cover of D’).

So let us prove this now. Indeed, assume the contrary. Thus, D’ has a path
cover Q’ such that Ends(Q’) is a proper subset of Ends(P’). Consider this Q’.
Note that¹⁴²

Ends(Q’) ⊊ Ends(P’) = {v, v_2, v_3, ..., v_k}.


141 because the digraph D has |V| = N vertices
142 The symbol “⊊” (note that the stroke only crosses the straight line, not the curved one) means
“proper subset of”.

As a consequence, |Ends(Q’)| < |{v, v_2, v_3, ..., v_k}| = k.
Now, we are in one of the following three cases:
Case 1: We have v ∈ Ends(Q’).
Case 2: We have v ∉ Ends(Q’) but v_2 ∈ Ends(Q’).
Case 3: We have v ∉ Ends(Q’) and v_2 ∉ Ends(Q’).
Let us consider these cases one by one:


• We first consider Case 1. In this case, we have v ∈ Ends(Q’). In other
words, some path p ∈ Q’ ends at v. Let us extend this path p beyond v by
inserting the red arc and the vertex v_1 at its end. Thus, we obtain a path
of D, which we call p + v_1. Replacing p by p + v_1 in Q’, we obtain a path
cover Q of D such that Ends Q is a proper subset of Ends P¹⁴³. But
this contradicts the fact that P is end-minimal. Thus, we have obtained a
contradiction in Case 1.


• Next, we consider Case 2. In this case, we have v ∉ Ends(Q’) but v_2 ∈
Ends(Q’). Combining Ends(Q’) ⊆ {v, v_2, v_3, ..., v_k} with v ∉ Ends(Q’),
we obtain

Ends(Q’) ⊆ {v, v_2, v_3, ..., v_k} \ {v} = {v_2, v_3, ..., v_k}.

From v_2 ∈ Ends(Q’), we see that some path p ∈ Q’ ends at v_2. Let us
extend this path p beyond v_2 by inserting the blue arc and the vertex v_1 at
its end. Thus, we obtain a path of D, which we call p + v_1. Replacing p by
p + v_1 in Q’, we obtain a path cover Q of D such that Ends Q is a proper
subset of Ends P¹⁴⁴. But this contradicts the fact that P is end-minimal.
Thus, we have obtained a contradiction in Case 2.


143 Proof. We obtained Q from Q’ by replacing p by p + v_1. As a consequence of this replacement,
the ending point v of p has been replaced by the ending point v_1 of p + v_1. Thus,

Ends Q = (Ends(Q’) \ {v}) ∪ {v_1}
⊆ {v_2, v_3, ..., v_k} ∪ {v_1}
(since Ends(Q’) ⊆ {v, v_2, v_3, ..., v_k} and thus Ends(Q’) \ {v} ⊆ {v_2, v_3, ..., v_k})
= {v_1, v_2, ..., v_k} = Ends P.

For the same reason, we have |Ends Q| = |Ends(Q’)| < k = |Ends P|, so that Ends Q ≠
Ends P. Combining this with Ends Q ⊆ Ends P, we conclude that Ends Q is a proper subset
of Ends P.


144 Proof. We obtained Q from Q’ by replacing p by p + v_1. As a consequence of this replacement,
the ending point v_2 of p has been replaced by the ending point v_1 of p + v_1. Thus,

Ends Q = (Ends(Q’) \ {v_2}) ∪ {v_1}
⊆ {v_3, v_4, ..., v_k} ∪ {v_1}
(since Ends(Q’) ⊆ {v_2, v_3, ..., v_k} and thus Ends(Q’) \ {v_2} ⊆ {v_3, v_4, ..., v_k})
= {v_1, v_3, v_4, ..., v_k}
⊊ {v_1, v_2, ..., v_k} = Ends P.

In other words, Ends Q is a proper subset of Ends P.

• Finally, we consider Case 3. In this case, we have v ∉ Ends(Q’) and
v_2 ∉ Ends(Q’). Combining this with Ends(Q’) ⊆ {v, v_2, v_3, ..., v_k}, we
obtain

Ends(Q’) ⊆ {v, v_2, v_3, ..., v_k} \ {v, v_2} = {v_3, v_4, ..., v_k},

so that |Ends(Q’)| ≤ |{v_3, v_4, ..., v_k}| = k − 2. Now, adding the trivial
path (v_1) to Q’ yields a path cover Q of D such that Ends Q is a proper
subset of Ends P. But this contradicts the fact that P is end-minimal.
Thus, we have found a contradiction in Case 3.


So we have obtained a contradiction in each case. Thus, our assumption 
was false. This shows that the path cover P’ is end-minimal. As we already 
said above, this allows us to apply the induction hypothesis to D’ instead of 
D, and conclude that the end-minimal path cover P’ of D’ has an independent 
cross-cut. This independent cross-cut is clearly an independent cross-cut of P 
as well, and thus we have shown that P has an independent cross-cut. This 
proves Claim 2.] 


As explained above, this completes the proof of Theorem 10.2.6. □


10.2.3. Applications 
Here are two simple applications of the Gallai-Milgram theorem: 


• Remember the Easy Rédei theorem (Theorem 4.10.6), which we proved
long ago. It says that each tournament has a Hamiltonian path.


We can now prove it again using the Gallai-Milgram theorem:


New proof of the Easy Rédei theorem: Indeed, let $D$ be a tournament. The
Gallai-Milgram theorem shows that $D$ has a path cover with an independent
cross-cut.¹⁴⁶ Consider this path cover and this cross-cut. But since
$D$ is a tournament, any independent set of $D$ has size $\leq 1$. Thus, our
independent cross-cut must have size $\leq 1$. Hence, our path cover must
consist of 1 path only (because the size of the path cover equals the size
of its cross-cut). But this means that it is a Hamiltonian path (or, more
precisely, it consists of a single path, which is necessarily a Hamiltonian
path). Hence, $D$ has a Hamiltonian path. So we have proved the Easy
Rédei theorem (Theorem 4.10.6) again.


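¹⁴⁵ Proof. We obtained $Q$ from $Q'$ by adding the trivial path $(v_1)$, whose ending point is $v_1$.
Thus,

\[
\operatorname{Ends} Q
= \underbrace{\operatorname{Ends}(Q')}_{\subseteq \{v_3, v_4, \ldots, v_k\}} \cup \{v_1\}
\subseteq \{v_3, v_4, \ldots, v_k\} \cup \{v_1\} = \{v_1, v_3, v_4, \ldots, v_k\}
\subsetneq \{v_1, v_2, \ldots, v_k\} = \operatorname{Ends} P.
\]

In other words, $\operatorname{Ends} Q$ is a proper subset of $\operatorname{Ends} P$.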
¹⁴⁶ See the above proof of Theorem 10.2.6 for the definition of a “cross-cut”.

• Less obviously, Hall’s Marriage Theorem (Theorem 8.3.4) and the Hall-König
matching theorem (Theorem 8.4.7) can be proved again using Gallai-Milgram.
Here is how:


New proof of the Hall-König matching theorem: Let $(G, X, Y)$ be a bipartite
graph.

Let $D$ be the digraph obtained from $G$ by directing each edge so that it
goes from $X$ to $Y$ (in other words, each edge with endpoints $x \in X$ and
$y \in Y$ becomes an arc with source $x$ and target $y$). Thus, in the digraph
$D$, no vertex can simultaneously be the source of some arc and the target
of some arc. Thus, any path of $D$ has length $\leq 1$. Here is an illustration
of a bipartite graph $(G, X, Y)$ (drawn as agreed in an earlier example) and the
corresponding digraph $D$:

[Figure not reproduced.]


As we said, any path of $D$ has length $\leq 1$. Thus, any path of $D$ corresponds
either to a vertex of $G$ or to an edge of $G$ (depending on whether
its length is 0 or 1). Hence, any path cover P of D necessarily consists of 
length-0 paths (corresponding to vertices of G) and length-1 paths (cor- 
responding to edges of G); moreover, the edges of P (that is, the edges 
corresponding to the length-1 paths in P) form a matching of G, and the 
vertices of P (that is, the vertices corresponding to length-0 paths in P) 
are precisely the vertices that are not matched in this matching. 


Now, Theorem 10.2.6 shows that there exist a path cover $P$ of $D$ and an
independent cross-cut $S$ of $P$. Consider these $P$ and $S$. For the purpose
of illustration, let us draw a path cover $P$ (by marking the arcs in red) and
an independent cross-cut $S$ of $P$ (by drawing each vertex $s \in S$ as a blue
diamond instead of a green circle):


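[Figure not reproduced: the digraph $D$, with the arcs of $P$ in red and the vertices of $S$ drawn as blue diamonds.]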


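We have $|S| = |X \cap S| + |Y \cap S|$ (since the set $S$ is the union of its two
disjoint subsets $X \cap S$ and $Y \cap S$).


The set $S$ is an independent set of the digraph $D$, thus also an independent
set of the graph $D^{\operatorname{und}} = G$. From this, we easily obtain $N(X \cap S) \subseteq Y \setminus S$
(since $(G, X, Y)$ is a bipartite graph).¹⁴⁷ Therefore, $|N(X \cap S)| \leq |Y \setminus S|$,
so that $|Y \setminus S| \geq |N(X \cap S)|$. Hence,

\[
|Y| = \underbrace{|Y \setminus S|}_{\geq |N(X \cap S)|} + |Y \cap S|
\geq |N(X \cap S)| + \underbrace{|Y \cap S|}_{\substack{= |S| - |X \cap S| \\ \text{(since } |S| = |X \cap S| + |Y \cap S|\text{)}}}
= |N(X \cap S)| + |S| - |X \cap S|. \tag{55}
\]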


Now, let M be the set of edges of G corresponding to the length-1 paths 
in our path cover P. As we already mentioned, this set M is a matching 


¹⁴⁷ Proof. Let $v \in N(X \cap S)$. Thus, $v$ is a vertex with a neighbor in $X \cap S$. Let $x$ be this neighbor.
Then, $x \in X \cap S \subseteq X$, so that the vertex $v$ has a neighbor in $X$ (namely, $x$). Since $(G, X, Y)$ is
a bipartite graph, this entails that $v \in Y$. Furthermore, we have $x \in X \cap S \subseteq S$. If we had
$v \in S$, then the set $S$ would contain two adjacent vertices (namely, $v$ and $x$), which would
contradict the fact that $S$ is an independent set of $G$. Thus, we have $v \notin S$. Combining $v \in Y$
with $v \notin S$, we obtain $v \in Y \setminus S$.

Forget that we fixed $v$. We thus have shown that $v \in Y \setminus S$ for each $v \in N(X \cap S)$. In
other words, $N(X \cap S) \subseteq Y \setminus S$.

of $G$ (since two paths in $P$ cannot have a vertex in common). The vertices
that are not matched in $M$ are precisely the vertices that don’t belong to
any of the length-1 paths in $P$; in other words, they are the vertices that
belong to length-0 paths in $P$ (since $P$ is a path cover, and any path has
length $\leq 1$). We let $p$ be the number of such vertices that lie in $X$, and we
let $q$ be the number of such vertices that lie in $Y$.


Thus, our path cover $P$ contains exactly $p + q$ length-0 paths: namely, $p$
length-0 paths consisting of a vertex in $X$ and $q$ length-0 paths consisting
of a vertex in $Y$. Hence, the path cover $P$ consists of $|M| + p + q$ paths
in total (since it contains $|M|$ many length-1 paths). The set $S$ contains
exactly one vertex from each of these $|M| + p + q$ paths (since $S$ is a
cross-cut of $P$); therefore,

\[ |S| = |M| + p + q. \tag{56} \]


Each vertex y € Y that is matched in M belongs to exactly one M-edge 
(namely, to its M-edge), and conversely, each M-edge contains exactly one 
vertex in Y (which, of course, is matched in M). Thus, the map 


\[
\{\text{vertices in } Y \text{ that are matched in } M\} \to M, \qquad
y \mapsto (\text{the } M\text{-edge of } y)
\]

is a bijection. Hence, the bijection principle yields

\[ (\text{\# of vertices in } Y \text{ that are matched in } M) = |M|. \tag{57} \]


On the other hand, the set $Y$ contains exactly $q$ vertices that are not
matched in $M$ (by the definition of $q$). Therefore, $Y$ contains exactly $|Y| - q$
vertices that are matched in $M$. In other words,

\[ (\text{\# of vertices in } Y \text{ that are matched in } M) = |Y| - q. \]

Comparing this with (57), we obtain $|M| = |Y| - q$. In other words,

\[ |M| + q = |Y|. \tag{58} \]

The same argument (but applied to $X$ and $p$ instead of $Y$ and $q$) yields

\[ |M| + p = |X|. \tag{59} \]

Now, from (58), we obtain

\begin{align*}
|M| + q &= |Y| \\
&\geq |N(X \cap S)| + \underbrace{|S|}_{\substack{= |M| + p + q \\ \text{(by (56))}}} - |X \cap S| \qquad \text{(by (55))} \\
&= |N(X \cap S)| + |M| + p + q - |X \cap S| \\
&= \underbrace{|M| + p}_{\substack{= |X| \\ \text{(by (59))}}} + |N(X \cap S)| - |X \cap S| + q \\
&= |X| + |N(X \cap S)| - |X \cap S| + q.
\end{align*}

Cancelling $q$, we obtain

\[
|M| \geq |X| + |N(X \cap S)| - |X \cap S|
= |N(X \cap S)| + |X| - |X \cap S|. \tag{60}
\]

Thus, we have found a matching $M$ of $G$ and a subset $U$ of $X$ (namely,
$U = X \cap S$) such that $|M| \geq |N(U)| + |X| - |U|$. This proves the Hall-König
matching theorem (once again).


New proof of Hall’s Marriage Theorem: Proceed as in the proof of the Hall-König
matching theorem that we just gave. But now assume that our bipartite
graph $(G, X, Y)$ satisfies the Hall condition (i.e., we have $|N(A)| \geq |A|$
for each subset $A$ of $X$). Hence, in particular, $|N(X \cap S)| \geq |X \cap S|$.
Therefore, (60) becomes

\[
|M| \geq \underbrace{|N(X \cap S)|}_{\geq |X \cap S|} + |X| - |X \cap S| \geq |X|.
\]


Hence, Proposition (e) shows that the matching M is X-complete. 
Thus, G has an X-complete matching (namely, M). This proves Hall’s 
Marriage Theorem (once again). 
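
The correspondence used in these two proofs (path covers of $D$ are matchings of $G$ plus unmatched vertices) can be tried out on a small example. The following Python sketch is only an illustration under stated assumptions: the bipartite graph and all function names (`matchings`, `path_cover`, `independent_cross_cuts`, `neighbors`) are hypothetical and do not come from these notes. It enumerates every path cover of $D$ together with every independent cross-cut, and checks inequality (60) with $U = X \cap S$ each time.

```python
from itertools import combinations, product

# Hypothetical bipartite graph (G, X, Y). Each edge is written as (x, y) with
# x in X and y in Y, which is also the arc of D obtained by directing it from X to Y.
X = ["x1", "x2", "x3"]
Y = ["y1", "y2"]
EDGES = [("x1", "y1"), ("x2", "y1"), ("x3", "y1"), ("x3", "y2")]

def matchings():
    """Yield every matching of G as a tuple of pairwise disjoint edges."""
    for r in range(len(EDGES) + 1):
        for M in combinations(EDGES, r):
            touched = [v for e in M for v in e]
            if len(touched) == len(set(touched)):
                yield M

def path_cover(M):
    """Path cover of D attached to M: one length-1 path per M-edge,
    one length-0 path per unmatched vertex."""
    matched = {v for e in M for v in e}
    return list(M) + [(v,) for v in X + Y if v not in matched]

def independent_cross_cuts(P):
    """Yield each set with exactly one vertex per path of P and no two adjacent vertices."""
    for choice in product(*P):
        S = set(choice)
        if not any(x in S and y in S for (x, y) in EDGES):
            yield S

def neighbors(U):
    """N(U) for a subset U of X."""
    return {y for (x, y) in EDGES if x in U}

witnesses = []
for M in matchings():
    for S in independent_cross_cuts(path_cover(M)):
        U = set(X) & S
        # Inequality (60) with U = X ∩ S.
        assert len(M) >= len(neighbors(U)) + len(X) - len(U)
        witnesses.append((M, U))

print(len(witnesses) > 0)
```

The brute-force search over all path covers stands in for the Gallai-Milgram theorem, which guarantees that at least one pair $(P, S)$ exists; the sketch does not implement the inductive proof of Theorem 10.2.6.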


Exercise 10.4. Let $c$ and $r$ be two positive integers. Let $T$ be a tournament
with more than $r^c$ vertices. Each arc of $T$ is colored with one of the $c$ colors
$1, 2, \ldots, c$. Prove that $T$ has a monochromatic path of length $r$.

(A path is said to be monochromatic if all its arcs have the same color.) 


[Hint: Induct on c, and apply Gallai-Milgram to a certain digraph in the 
induction step.] 


Remark 10.2.9. If we apply Exercise 10.4 to $c = 1$, then we recover the easy
Rédei theorem (Theorem 4.10.6). Indeed, if $T$ is any tournament, then we can
color all its arcs with the color 1, and then use Exercise 10.4 (applied to $c = 1$
and $r = |V(T)| - 1$) to conclude that $T$ has a monochromatic path of length
$|V(T)| - 1$. But such a path must necessarily be a Hamiltonian path (since
its length forces it to contain all vertices of $T$).
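
The easy Rédei theorem recovered here can be sanity-checked by brute force on small tournaments. The Python sketch below is an illustration only: the helper names `all_tournaments` and `hamiltonian_path` are hypothetical, and the search is exponential, so it is meant for very small vertex counts and proves nothing in general.

```python
from itertools import combinations, permutations, product

def all_tournaments(n):
    """Yield every tournament on vertices 0..n-1 as a set of arcs (u, v)."""
    pairs = list(combinations(range(n), 2))
    for flips in product([False, True], repeat=len(pairs)):
        # For each pair {u, v}, keep exactly one of the arcs (u, v) and (v, u).
        yield {(v, u) if flip else (u, v) for (u, v), flip in zip(pairs, flips)}

def hamiltonian_path(n, arcs):
    """Return a Hamiltonian path as a tuple of vertices, or None if there is none."""
    for order in permutations(range(n)):
        if all((order[i], order[i + 1]) in arcs for i in range(n - 1)):
            return order
    return None

# Check every tournament with 1, 2, 3 or 4 vertices.
checked = 0
for n in range(1, 5):
    for arcs in all_tournaments(n):
        assert hamiltonian_path(n, arcs) is not None
        checked += 1

print(checked)
```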


10.3. Path-missing sets 


We move on to less well-trodden ground. 

Menger’s theorem (one of the many) is from 1927; the Gallai-Milgram theo- 
rem is from 1960. One might think that everything that can be said about paths 
in graphs has been said long ago. 

Apparently, this is not the case. In 2017, when trying to come up with a
homework exercise for a previous iteration of this course, I was experimenting
with paths in Python. Specifically, I was looking at digraphs $D = (V, A, \psi)$ with
two distinct vertices $s$ and $t$ selected. Inspired by the arc-Menger theorems, I
was looking at the subsets $B$ of $A$ that could be removed without disconnecting
$s$ from $t$ (more precisely, without destroying all paths from $s$ to $t$). I noticed
that the number of such subsets $B$ seemed to be even whenever $D$ has a cycle
or a “useless arc” (i.e., an arc that is used by no path from $s$ to $t$) and odd
otherwise.¹⁴⁸

I could not prove this observation. Soon after, Joel Brewster Lewis and Lukas
Katthän came up with a proof and multiple stronger results. The proofs can
now be found in a joint preprint [GrKaLe21], although I believe that they are
far from optimal (this is one reason we have not submitted the preprint to a
journal yet).

The first way to strengthen the observation is to replace the parity claim (i.e., 
the claim that the number is even or odd depending on cycles and useless arcs) 
by a stronger claim about an alternating sum. This is an instance of a general 
phenomenon, in which a statement of the form “the number of some class of 
things is even” can often be replaced by a stronger statement of the form “we 
can assign a plus or minus sign to each of these things, and then the total 
number of plus signs equals the total number of minus signs”. The stronger 
statement is as follows: 


Theorem 10.3.1 (Grinberg–Lewis–Katthän). Let $D = (V, A, \psi)$ be a multidigraph.
Let $s$ and $t$ be two distinct vertices of $D$. A subset $B$ of $A$ will be called
path-missing if $D$ has a path from $s$ to $t$ that does not use any of the arcs in
$B$ (that is, a path from $s$ to $t$ that would not be destroyed if we remove all
arcs in $B$ from $D$). (In the terminology of Definition 10.1.3, this is the same
as saying that $B$ is not an $s$-$t$-arc-separator.)

Let M be the set of all path-missing subsets of A. 


(a) If $D$ has an arc that is not used by any path from $s$ to $t$ (this is what we
call a “useless arc”), then

\[ \sum_{B \in M} (-1)^{|B|} = 0 \]

(and thus $|M|$ is even).


(b) If $D$ has a cycle, then

\[ \sum_{B \in M} (-1)^{|B|} = 0 \]

(and thus $|M|$ is even).


(c) If $A = \varnothing$, then

\[ \sum_{B \in M} (-1)^{|B|} = 0 \]

(and thus $|M|$ is even).


(d) In all other cases, we have

\[ \sum_{B \in M} (-1)^{|B|} = (-1)^{|A| - |V'|}, \]

where $V'$ is the set of all vertices of $D$ that have outdegree $> 0$ (and
thus $|M|$ is odd).


¹⁴⁸ With one exception: If $A = \varnothing$, then it is even.


Example 10.3.2. Let $D = (V, A, \psi)$ be the following digraph:

[Figure not reproduced: a digraph with two marked vertices $s$ and $t$ and with arcs $a, b, c, d, e, f$.]

Let $s$ and $t$ be the vertices labelled $s$ and $t$ here. Then, $D$ has neither a cycle
nor a “useless arc”, and its arc set $A$ is nonempty; thus, Theorem 10.3.1 (d)
applies. The path-missing subsets of $A$ are the three sets $\{a,b,c,d\}$, $\{c,e\}$
and $\{d,e,f\}$ as well as all their subsets (such as $\{b,c,d\}$). In other words,

\begin{align*}
M &= \{\text{all subsets of } \{a,b,c,d\}\} \cup \{\text{all subsets of } \{c,e\}\}
     \cup \{\text{all subsets of } \{d,e,f\}\} \\
  &= \{ \varnothing, \{a\}, \{b\}, \{c\}, \{d\}, \{a,b\}, \{a,c\}, \{a,d\}, \{b,c\}, \{b,d\}, \\
  &\qquad \{c,d\}, \{a,b,c\}, \{a,b,d\}, \{a,c,d\}, \{b,c,d\}, \{a,b,c,d\}, \\
  &\qquad \{e\}, \{c,e\}, \{f\}, \{d,e\}, \{e,f\}, \{d,f\}, \{d,e,f\} \}.
\end{align*}

Hence, the sum $\sum_{B \in M} (-1)^{|B|}$ has 11 addends equal to $-1$ and 12 addends
equal to $1$; thus, this sum equals $1$. This is precisely the value
$(-1)^{|A| - |V'|} = (-1)^{6-4} = 1$ predicted by Theorem 10.3.1 (d).
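
The count in this example can be reproduced by brute force. The following Python sketch is not the original experiment mentioned above; the function name `path_missing_sum` and the arc dictionary `D` are assumptions made for illustration. In particular, `D` is one digraph whose paths from $s$ to $t$ have arc sets $\{a,b,c\}$, $\{e,f\}$ and $\{a,b,d,f\}$ (the complements of the three maximal path-missing sets listed above); the actual figure may label its arcs differently.

```python
from itertools import combinations

def path_missing_sum(arcs, s, t):
    """Return (number of path-missing subsets B, sum of (-1)^|B| over them).

    `arcs` maps each arc label to its (source, target) pair.
    """
    labels = list(arcs)
    count, total = 0, 0
    for r in range(len(labels) + 1):
        for B in combinations(labels, r):
            removed = set(B)
            # Depth-first search from s in the digraph with the arcs of B removed.
            seen, stack = {s}, [s]
            while stack:
                u = stack.pop()
                for a, (src, tgt) in arcs.items():
                    if a not in removed and src == u and tgt not in seen:
                        seen.add(tgt)
                        stack.append(tgt)
            if t in seen:  # some path from s to t avoids B, so B is path-missing
                count += 1
                total += (-1) ** r
    return count, total

# Assumed digraph, consistent with the path-missing sets of Example 10.3.2.
D = {"a": ("s", "u"), "b": ("u", "w"), "c": ("w", "t"),
     "d": ("w", "x"), "e": ("s", "x"), "f": ("x", "t")}
print(path_missing_sum(D, "s", "t"))

# Adding an arc from w back to u creates a cycle; the theorem then predicts sum 0.
D_cyclic = dict(D, g=("w", "u"))
print(path_missing_sum(D_cyclic, "s", "t"))
```

Reachability by a walk is enough here, since any walk from $s$ to $t$ contains a path from $s$ to $t$ using a subset of its arcs.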


Proof of Theorem 10.3.1. See [GrKaLe21, Theorem 1.3] (where $M$ is denoted by
PM(D), and where arcs are called “edges”). Of course, part (c) is obvious,
and part (a) is easy (since inserting a useless arc into a set $B \in M$ or removing
it from a set $B \in M$ always results in a set in $M$). Parts (b) and (d) are the
interesting ones. The proof in [GrKaLe21, Theorem 1.3] relies on a recursive
argument (“deletion-contraction”) in which we pick an arc $a$ with source $s$ and
consider the two smaller digraphs $D \setminus a$ and $D / a$ obtained (respectively) by
deleting the arc $a$ from $D$ and by “contracting” $a$ “to a point”.


Further levels of strength can be reached by treating $M$ as a topological
space. Indeed, $M$ is not just a random collection of sets of arcs, but actually
a simplicial complex (since any subset of a path-missing subset of $A$ is again
path-missing). Simplicial complexes are known to be a combinatorial model
for topological spaces, and in particular they have homology groups, homotopy
types, etc. Thus, in particular, we can ask ourselves what the topological
space corresponding to the simplicial complex $M$ looks like. This, too, has been
answered in [GrKaLe21, Theorem 1.3]: It is homotopy-equivalent to a sphere or a ball
(depending on the existence of cycles or “useless arcs”); its dimension can also
be determined explicitly. (The sum $\sum_{B \in M} (-1)^{|B|}$ discussed above is, of course,
its reduced Euler characteristic.)


10.4. Elser’s sums 


We now return to undirected (multi)graphs. Here is a result found by Veit Elser
in 1984 ([Elser84, Lemma 1]), as a lemma for his work in statistical mechanics:¹⁴⁹


¹⁴⁹ I have restated the result beyond recognition; see [Grinbe21, Remark 1.4] for why Theorem
10.4.1 actually implies [Elser84, Lemma 1].

Theorem 10.4.1 (Elser’s theorem, in my version). Let $G = (V, E, \varphi)$ be a
multigraph with at least one edge. Fix a vertex $v \in V$.

If $F \subseteq E$, then an $F$-path shall mean a path of $G$ such that all edges of this
path belong to $F$. In other words, it means a path of the spanning subgraph
$(V, F, \varphi|_F)$.

If $e \in E$ is an edge and $F \subseteq E$ is a subset, then we say that $F$ infects $e$
if there exists an $F$-path from $v$ to some endpoint of $e$. (The terminology
is inspired by the idea that some infectious disease starts at $v$ and spreads
along the $F$-edges.)

(Note that if an edge $e$ contains the vertex $v$, then any subset $F$ of $E$ (even
the empty set) infects $e$, because $(v)$ is a trivial $F$-path from $v$ to $v$.)

Then,

\[
\sum_{\substack{F \subseteq E \text{ infects} \\ \text{every edge } e \in E}} (-1)^{|F|} = 0.
\]

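Example 10.4.2. Let $G = (V, E, \varphi)$ be the following graph:

[Figure not reproduced: a graph with edges labelled 1, 2, 3, 4 and a vertex labelled $v$.]

and let $v$ be the vertex labelled $v$. Then, the subsets of $E$ that infect every
edge are

\[ \{1,2\},\ \{1,4\},\ \{3,4\},\ \{1,2,3\},\ \{1,3,4\},\ \{1,2,4\},\ \{2,3,4\},\ \{1,2,3,4\}. \]

Thus,

\begin{align*}
\sum_{\substack{F \subseteq E \text{ infects} \\ \text{every edge } e \in E}} (-1)^{|F|}
&= (-1)^2 + (-1)^2 + (-1)^2 + (-1)^3 + (-1)^3 + (-1)^3 + (-1)^3 + (-1)^4 \\
&= 0,
\end{align*}

exactly as predicted by Theorem 10.4.1.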
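
The sum in this example is easy to recompute mechanically. The Python sketch below is an illustration under an assumption: since the figure is not available here, the graph `C4` is a hypothetical 4-cycle whose edge labelling happens to produce exactly the eight infecting subsets listed above; it need not be the graph that was drawn. The function names are likewise made up for this sketch.

```python
from itertools import combinations

def infects_every_edge(edges, v, F):
    """Check whether the edge subset F infects every edge.

    `edges` maps each edge label to its pair of endpoints.
    """
    # Collect all vertices reachable from v along edges of F.
    reached, stack = {v}, [v]
    while stack:
        u = stack.pop()
        for e in F:
            a, b = edges[e]
            for x, y in ((a, b), (b, a)):
                if x == u and y not in reached:
                    reached.add(y)
                    stack.append(y)
    # An edge is infected when at least one of its endpoints has been reached.
    return all(a in reached or b in reached for (a, b) in edges.values())

def elser_sum(edges, v):
    """Return the list of subsets infecting every edge and their alternating sum."""
    labels = sorted(edges)
    pandemic = [set(F)
                for r in range(len(labels) + 1)
                for F in combinations(labels, r)
                if infects_every_edge(edges, v, F)]
    return pandemic, sum((-1) ** len(F) for F in pandemic)

# Hypothetical 4-cycle v - a - b - c - v with edges labelled 1, 2, 3, 4.
C4 = {1: ("v", "a"), 2: ("a", "b"), 3: ("b", "c"), 4: ("c", "v")}
pandemic, total = elser_sum(C4, "v")
print(pandemic)
print(total)
```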


Remark 10.4.3. It might appear more natural to study subsets $F \subseteq E$ infecting
vertices rather than edges. However, Theorem 10.4.1 would be false if we
replaced “every edge $e \in E$” by “every vertex $v \in V$”. The graph in Example
10.4.2 provides a counterexample.

However, if we go further and replace $F \subseteq E$ by $W \subseteq V$, then we get
something true again; see Theorem 10.4.4 below.


Proof of Theorem 10.4.1. Elser’s proof is somewhat complicated. I give a different
proof in [Grinbe21, Theorem 1.2], which is elementary and nice if I may say
so myself.

My proof should also be not very hard to discover, once you have the following
hint: It suffices to prove the equality

\[
\sum_{\substack{F \subseteq E \text{ does not infect} \\ \text{every edge } e \in E}} (-1)^{|F|} = 0
\]

(because the total sum $\sum_{F \subseteq E} (-1)^{|F|}$ is known to be $0$). In order to prove this
equality, we equip the set $E$ with some total order (it doesn’t matter how; we
can just rank the edges arbitrarily), and we make the following definition: If
$F \subseteq E$ is a subset that does not infect every edge $e \in E$, then we let $\varepsilon(F)$ be
the smallest (with respect to our chosen total order) edge that is not infected
by $F$. Now, you can show that if $F \subseteq E$ is a subset that does not infect every
edge $e \in E$, then the set¹⁵⁰ $F' := F \mathbin{\triangle} \{\varepsilon(F)\}$ (that is, the set obtained from $F$
by inserting $\varepsilon(F)$ if $\varepsilon(F) \notin F$ and by removing $\varepsilon(F)$ if $\varepsilon(F) \in F$) has the same
property (viz., it does not infect every edge $e \in E$) and satisfies $\varepsilon(F') = \varepsilon(F)$.
This entails that the addends in the sum

\[
\sum_{\substack{F \subseteq E \text{ does not infect} \\ \text{every edge } e \in E}} (-1)^{|F|}
\]

cancel each other in pairs (namely, the addend for a given set $F$ cancels the addend for the
set $F' = F \mathbin{\triangle} \{\varepsilon(F)\}$), and thus the whole sum is $0$. □
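
The pairing described in this hint can be checked directly on a small graph. The Python sketch below is an illustration under assumptions: the graph is a hypothetical 4-cycle with base vertex `"v"`, the total order on the edges is the numerical order of their labels, and the names `uninfected` and `partner` are made up. It verifies that the map sending $F$ to $F \mathbin{\triangle} \{\text{smallest uninfected edge}\}$ is an involution on the non-infecting subsets that changes the size by exactly one.

```python
from itertools import combinations

# Hypothetical 4-cycle v - a - b - c - v; edges are ordered by their labels.
EDGES = {1: ("v", "a"), 2: ("a", "b"), 3: ("b", "c"), 4: ("c", "v")}
BASE = "v"

def uninfected(F):
    """Return the sorted list of edges that the edge subset F does not infect."""
    reached, grew = {BASE}, True
    while grew:
        grew = False
        for e in F:
            a, b = EDGES[e]
            # Spread along an F-edge with exactly one endpoint already reached.
            if (a in reached) != (b in reached):
                reached |= {a, b}
                grew = True
    return sorted(e for e, (a, b) in EDGES.items()
                  if a not in reached and b not in reached)

def partner(F):
    """Toggle the smallest uninfected edge in F (symmetric difference)."""
    smallest = uninfected(F)[0]
    return F ^ {smallest}

subsets = [frozenset(c) for r in range(len(EDGES) + 1)
           for c in combinations(EDGES, r)]
non_pandemic = [F for F in subsets if uninfected(F)]

for F in non_pandemic:
    G = partner(F)
    # G still fails to infect every edge and has the same smallest uninfected edge.
    assert uninfected(G) and uninfected(G)[0] == uninfected(F)[0]
    # The map is an involution and changes the size by exactly one.
    assert partner(G) == F and abs(len(G) - len(F)) == 1

total = sum((-1) ** len(F) for F in non_pandemic)
print(total)
```

The reason the check succeeds is the one behind the hint: neither endpoint of an uninfected edge is reachable from the base vertex, so toggling that edge does not change which vertices are reached.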


Elser’s theorem, too, can be generalized and strengthened. The strengthening
is similar to what we did with Theorem 10.3.1. We treat the set of all
“non-pandemic-causing subsets” (i.e., of all subsets $F \subseteq E$ that don’t infect every
edge) as a simplicial complex (since a subset of a non-pandemic-causing
subset is again non-pandemic-causing), and analyze this complex as a topological
space. The claim of Theorem 10.4.1 then says that the reduced Euler
characteristic of this space is 0; but we can actually show that this space is
contractible (i.e., homotopy-equivalent to a point). Even better, we can prove that
the simplicial complex of all non-pandemic-causing subsets is collapsible (a
combinatorial property that is stronger than contractibility of the corresponding
space). See [Grinbe21, §5] for definitions and proofs.


¹⁵⁰ The symbol $\triangle$ stands for the symmetric difference of two sets. Recall its definition: If $X$ and
$Y$ are two sets, then their symmetric difference $X \mathbin{\triangle} Y$ is defined to be the set

\[
(X \cup Y) \setminus (X \cap Y) = (X \setminus Y) \cup (Y \setminus X)
= \{\text{all elements that belong to exactly one of } X \text{ and } Y\}.
\]

We can furthermore generalize the theorem. One way to do so is to replace
our “patient zero” $v$ by a set of vertices. This leads to a much less trivial
situation. The recent paper by Dorpalen-Barry, Hettle, Livingston,
Martin, Nasr, Vega and Whitlatch proves some results and asks some questions
(that are still open as of 2022).

A different direction in which Elser’s theorem can be generalized is more fundamental:
It turns out that the theorem is not really about graphs and edges.
Instead, there is a general structure that I call a “shade map”, which always
leads to a certain sum being 0. See [Grinbe21, §4] for the details of this generalization.
I will not explain it here, but I will state one more particular case
of it ([Grinbe21, Theorem 3.2]), which replaces edges by vertices throughout
Theorem 10.4.1:


Theorem 10.4.4 (vertex-Elser’s theorem). Let $G = (V, E, \varphi)$ be a multigraph
with at least two vertices. Fix a vertex $v \in V$.

If $W \subseteq V$, then a $W$-vertex-path shall mean a path $p$ such that all intermediate
vertices of $p$ belong to $W$. (Recall that the “intermediate vertices of $p$”
mean all vertices of $p$ except for the starting and ending points of $p$.) (Note
that any path of length $\leq 1$ is automatically a $W$-vertex-path, since it has no
intermediate vertices.)

If $w \in V \setminus \{v\}$ is any vertex, and $W \subseteq V \setminus \{v\}$ is any subset, then we say
that $W$ vertex-infects $w$ if there exists a $W$-vertex-path from $v$ to $w$. (This is
always true when $w$ is a neighbor of $v$.)


Then,

\[
\sum_{\substack{W \subseteq V \setminus \{v\} \text{ vertex-infects} \\ \text{every vertex } w \in V \setminus \{v\}}} (-1)^{|W|} = 0.
\]
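
As with the edge version, the vertex version can be tested by brute force on small graphs. The following Python sketch is an illustration only: the function name `vertex_elser_sum`, the adjacency-list encoding and the two test graphs (a 4-cycle and a 3-vertex path) are assumptions chosen for this sketch, not examples from these notes.

```python
from itertools import combinations

def vertex_elser_sum(adj, v):
    """Alternating sum of (-1)^|W| over all subsets W of V minus {v}
    that vertex-infect every vertex other than v.

    `adj` maps each vertex to the list of its neighbors.
    """
    others = sorted(u for u in adj if u != v)
    total = 0
    for r in range(len(others) + 1):
        for W in combinations(others, r):
            allowed = set(W) | {v}
            # "Hubs" are v and the vertices of W reachable from v through W only;
            # a W-vertex-path may pass through hubs and nothing else.
            hubs, stack = {v}, [v]
            while stack:
                u = stack.pop()
                for x in adj[u]:
                    if x in allowed and x not in hubs:
                        hubs.add(x)
                        stack.append(x)
            # A vertex is vertex-infected exactly when it is adjacent to some hub.
            infected = {x for u in hubs for x in adj[u]}
            if all(w in infected for w in others):
                total += (-1) ** r
    return total

# Hypothetical test graphs: the 4-cycle v - a - b - c - v and the path v - a - b.
C4 = {"v": ["a", "c"], "a": ["v", "b"], "b": ["a", "c"], "c": ["b", "v"]}
P3 = {"v": ["a"], "a": ["v", "b"], "b": ["a"]}
print(vertex_elser_sum(C4, "v"), vertex_elser_sum(P3, "v"))
```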